Using the Visual Query Builder Workbench App for DataEngine
The C1DataEngine is an in-memory data manager that enables extraction, transformation and loading (ETL) for large data sets. You can import data directly from SQL Server or any enumerable collection of strongly-typed objects such as OData. From there, C1DataEngine enables you to analyze large amounts of unstructured data, merge disparate data sources or simply load millions of records in a second or less using its smart caching technology.
C1DataEngine is not a UI component — it’s a cross-platform .NET Standard library entirely programmed within code (C#). In the 2019 v3 release, we introduced a helpful tool known as the Workbench app. In this post, I will show you how you can get started using C1DataEngine, visually through the Workbench app, and how you can use Workbench as a standalone data analysis tool.
Introducing the Workbench App
Workbench is a standalone, desktop app that supports the creation and visualization of C1DataEngine workspaces.
A workspace, in DataEngine terms, is a folder that stores all metadata, base tables, and query results. In code, this folder is represented by an instance of the Workspace class. But in the Workbench app, a workspace is a literal folder on your desktop that will contain these things. Each workspace folder is managed by a metadata.xml file that is generated by the Workbench app. This file handles the memory-mapped files for C1DataEngine’s column-oriented data storage model. But that’s more than you need to know.
What’s the purpose of the Workbench App?
For developers who are using the C1DataEngine library in their .NET Core applications, this tool provides the following benefits:
- Import external data without coding. Supported data formats include SQL Server, CSV, and JSON.
- Create and execute queries without coding using an intuitive visual designer.
- Generate JSON strings that can be used to create queries programmatically using the C1DataEngine API library.
- Examine base tables and query result sets created by any application that uses C1DataEngine.
- Visualize query result sets using a variety of chart components.
- Remove unwanted query result sets from existing C1DataEngine workspaces.
Workbench can also be used for code-free data analysis. Let’s see how this is useful next.
Using Workbench for Free Data Analysis
C1DataEngine Workbench is also useful as a standalone tool for data analysis. No programming knowledge is required. Essentially, Workbench is most useful as a visual query builder as it can generate query strings that can be used in a variety of other applications or within your code. For example, you can use Workbench to analyze JSON data files you may have lying around. Plus, Workbench is a free app, and it can be used on Windows, Linux, or Mac.
You can download the standalone Workbench Electron app here.
If you are reading this far in the future, you should download the ComponentOne Service Components to get the latest version of Workbench.
Getting Started using the Workbench App
For a quick tutorial on using the Workbench app, check out the following topics from the documentation:
Getting Started with C1DataEngine Workbench.
Or you can watch this video below. It walks you through how to do a few basic tasks such as creating a base table, executing a query, and visualizing the results.
Additional topics for filtering data and joining queries can also be found in the documentation. Next, I’ll walk through importing a JSON file and creating a filtered query for analysis.
Loading and Analyzing a Local JSON File
Let’s see how we can use the filtering capability to analyze a data set to answer some pressing questions. Consider that I have this raw data file that includes information about pet licenses in the Seattle, Washington area.
My JSON data set looks like this:
To load this into Workbench, I will first create a new workspace folder. I can browse to the location of the JSON file on my computer and select that folder as the workspace.
You can follow along using this same data file installed with C1DataEngine here:
\Documents\ComponentOne Samples\DataEngine\CS\PetLicenses\data
Next, I import the file records as a new base table named “PetLicences.” This is done by choosing JSON as the provider and providing the local file path (leave the JSON Path with the default value and skip the Command tab). See the graphic below:
Now I have imported the base table for my JSON file. Next, I can analyze it inside Workbench.
Filtering a Data Set using C1DataEngine Range Expressions
My data set includes licenses for all cats and dogs in Seattle during 2018. The license record includes information about the type of pet and the pet's name.
Some questions I may ask from this data set include:
- Which pet species has more licenses, cats, or dogs?
- How many licenses were issued by calendar month?
- What were the ten most popular dog names?
Let’s build a visual query to answer each of these questions.
Q: Which pet species have more licenses, cats or dogs?
Steps:
- Create a new Query and select PetLicenses for the Table.
- Select the Species and AnimalsName columns. Optionally, rename the 2nd column “Total” (really you can choose any column here because we are doing a Count operation) and select Count operation.
- Name the Query and click Create.
Now I can view the query results in a grid or chart. I now know the answer to my question is dogs!
And to my surprise, I learn that there are more than just cats and dogs licensed in Seattle (goats and pigs can be pets?). I can edit the table to display it as a chart instead by using the menu options.
Q: How many licenses were issued by calendar month?
Steps:
- Create a new Query and select PetLicenses for the Table.
- Select the Date and LicenseNumber columns.
- Set the Date column’s operation to “Month”. This provides special date/time aggregation. Note: you may need to click to another row before the date/time operations appear in the dropdown.
- Set the LicenseNumber column’s operation to “Count”. Optionally, rename the LicenseNumber column to “Total” (really you can choose any column here because we are doing a Count operation).
- Name the Query and click Create.
I can best view this query result as a bar chart. I can learn from this query that the number of licenses trended upward throughout the year. It’s important to use the Count operation if we are looking for the total number of records. If I had chosen “Sum” then the LicenseNumbers would be added together and that value is meaningless.
Q: What were the ten most popular dog names?
Not only can we aggregate the data, but we can also filter and sort the data to visualize the Top N values that match some criteria.
Steps:
- Create a new Query and select PetLicenses for the Table.
- Select the Animal Name columns and add it twice.
- Set the second name column’s operation to “Count”. Optionally, rename the Animal’s Name column to “Count” (really you can choose any column here because we are doing a Count operation).
- On the Ranges tab, set a filter condition for Species Equals “Dog”. This will only count rows for Dogs.
- Name the Query and click Create.
I first need to create the query before I can apply a top N filter. View the query result in a grid and select “Edit” from the menu. In the property editor, I set the Row Limit to 10, Sort By to “Count” and Sort Order to “Descending”.
This displays the top 10 dog names in my data set.
C1DataEngine Conclusion
The true value of C1DataEngine is its in-memory cache management for large data sets. The Workbench app I have demonstrated here is a handy tool we built that uses the engine and allows you to play around with data and build queries in a visual manner.
Here are some additional resources to learn more about C1DataEngine: