Pivot Transform Using C1 DataEngine
Data is the new Oil of the 21st century, and the amount of data generated every day is enormous and provides valuable information when analyzed in depth. Data can be used to learn a lot about a company, such as its revenue in a given amount of time or which sold the most. This analysis can further aid decision-making for sales forecasting, planning, marketing, and other business activities.
Data traditionally is stored in a tabular form where each row represents a single item of data. Usually, this raw data is very large and has a complex structure, and not very useful to us, so we need to display it in a summarized format, which we can easily ascertain what it represents.
For example, suppose a car selling company needs to gather information about which state has sold most cars. In that case, it will be better to aggregate each sale according to its respective state instead of looking at the data row by row. This type of summarization/aggregation is known as Pivot transform. In addition to aggregation, pivot transform can also perform sorting, filtering, grouping multiple datasets.
ComponentOne's C1DataEngine and C1FlexPivotEngine can be used to apply pivot transform on large sets of data. C1DataEngine offers fast data analysis by employing optimization techniques like column-oriented and memory-mapped file techniques to query data from large data sets. Pivots can be created over the resulting data using C1FlexPivotEngine.
We can display the queried data in many different transformation formats. The column/rows fields can be grouped according to other fields as per the requirements. This can help us to transform millions of data as per our business requirements. The data for C1DataEngine can be imported from various sources like SQL database, JSON files, CSV files, etc, and C1FlexPivotEngine can be connected to C1DataEngine easily.
Let's now delve into how we can use C1 Data Engine to apply various pivot transformations to data.
Using C1DataEngine in your project
C1DataEngine can be integrated into a .NET project very easily. The following NuGet packages are required for C1DataEngine:
- C1.DataEngine.Core
- C1.DataEngine.Core.Api
C1DataEngine is initialized using a Workspace class. When an object of this class is created, a folder is created in the project directory, which contains all the metadata, tables, queries, and results of the queries. The Init method of the Workspace class is used to initialize a workspace (or create a new one if not present).
this._workSpace = new Workspace();
this._workSpace.Init("workspace");
This will initialize the workspace Workspace. Now, we need to connect this workspace to a data source. For this article, we will use SQL Database:
SqlConnection conn = new SqlConnection(GetConnectionString());
conn.Open();
var command = new SqlCommand("Select * from Invoices", conn);
Now a connection is created, the DbConnector class can be used to connect this SqlConnection to this workspace:
var connector = new DbConnector(this._workSpace, conn, command);
connector.GetData("Invoices");
this.invoice = this._workSpace.table("Invoices");
Pivot over query
Now that workspace and data source are initialized, we can query the data engine on the Invoices table. Next, we will connect this data engine to C1FlexPivotEngine. Note that C1FlexPivotEngine requires the table or query name for initialization. So, we will first need to execute a query on the data engine, and then we can create a generic method that will return an instance of C1FlexPivotEngine initialized according to the query.
C1FlexPivotEngine CreateFlexPivotEngine(string queryName, string[] colFields, string[] rowFields, string[] valFields)
{
var fp = new C1FlexPivotEngine();
fp.Workspace = this._workSpace;
fp.ConnectDataEngine(queryName);
fp.BeginUpdate();
fp.ColumnFields.Add(colFields);
fp.RowFields.Add(rowFields);
fp.ValueFields.Add(valFields);
fp.TotalsBeforeData = true;
fp.EndUpdate();
return fp;
}
Here, the queryName param will be the name of the query (which is later explained) or the table name. The RowFields and ColumnFields are dimension fields using which the engine groups the data. The ValueFields are measure fields on which the aggregation is applied. They are typically numerical fields.
The C1FlexPivotEngine is initialized by providing the workspace and then connected to the data engine using the query name. Once the flex pivot engine is created, we can get the transformed data using the FlexPivotDefaultView property.
We will query the DataEngine only once and will fetch all the required columns along with aggregates. Then, we can use the C1FlexPivotEngine to apply different transforms to this single query result. In DataEngine, we can use the query method of the Workspace class.
This method takes an optional name parameter that stores the results using this name so we can use it in the future (the queryName parameter in the CreateFlexPivotEngine method). The second parameter is important. This is where we can provide information about how to query the data:
var queryName = this.QueryName;
var settings = new
{
invoice.Country,
invoice.ProductName,
invoice.Salesperson,
invoice.OrderDate,
Sales = Op.Sum(invoice.ExtendedPrice),
Quantity = Op.Count(invoice.Quantity),
QuantityTotal = Op.Sum(invoice.Quantity)
};
dynamic query = GetQuery(queryName, settings);
query.Query.Execute();
The query is executed, and the results are stored in the workspace folder for further use.
Single column pivot transform
In a single column pivot transform, the transform is applied only on a single column. This means that the data is aggregated on the basis of a single column only. From the Invoices table., if we need to know how much sales were done in each country, then we can apply single column pivot transform where we will group the data by country and add all the sales in that particular country.
var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { "" }, new string[] { "Country" }, new string[] { "Sales" });
IBindingList data = fp.FlexPivotDefaultView;
The data will be transformed according to the fields. The country column will be displayed in a column since it is added as a row field, and the aggregated sales data for each country will be displayed alongside each country on a different column.
We can also use single column transform to find out how many quantities of a single product were sold. For this, the data will be grouped on the Product column and aggregation can be applied on the Quantity column:
var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { "" }, new string[] { "ProductName" }, new string[] { "Quantity" });
IBindingList data = fp.FlexPivotDefaultView;
Multi-column pivot transform
As the name suggests, multi-column pivot transform refers to the type of pivot transform where the data is grouped by multiple columns. The top-level column is grouped first, then the column below this level, and so on. For example, in addition to total sales for a country, if we also wish to know the total sales of each product inside a particular country, then we can apply multi-column pivot transform with the country as the top-level column, then product and then summary of sales for each item:
var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { "Country" }, new string[] { "ProductName" }, new string[] { "Sales" });
IBindingList data = fp.FlexPivotDefaultView;
The resulting data will contain a column for each distinct product followed by aggregated sales of each country in multiple columns. Each of the distinct values of the country will have its own sales column.
We can also apply the transformation on a deeper level of columns. If we add a column for the name of the person who completed the sale, then the transform will be applied on 3 columns:
var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { "Country" }, new string[] { "ProductName", "Salesperson" }, new string[] { "Quantity" });
IBindingList data = fp.FlexPivotDefaultView;
Grouping on the basis of ranges in DateTime columns
Almost every dataset for a company has a date column. Pivot transform can also be performed on a date-type column. Usually, it is not feasible to group data on every single date because the datasets can be extremely large and may contain very old data. So, we can use C1FlexPivotEngine to aggregate dates on a single property basis, such as a year or month.
This will reduce the number of items returned significantly and will also provide better insight into the data. For example, we can find the total sales done in a particular year. For this, we will use the Format and RangeType property of pivot fields:
var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { }, new string[] { "OrderDate" }, new string[] { "Sales" });
var fld = fp.RowFields[0];
fld.Range.RangeType = RangeType.Format;
fld.Format = "yyyy";
IBindingList data = fp.FlexPivotDefaultView;
We can also perform a multi-column transform on this. Suppose we wish to divide the above sales into years and months. Then we will need to add two row fields for OrderDate and change the format of each field:
var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { }, new string[] { "OrderDate", "OrderDate" }, new string[] { "QuantityTotal" });
var year = fp.RowFields[0];
year.Range.RangeType = RangeType.Format;
year.Format = "yyyy";
var month = fp.RowFields[1];
month.Range.RangeType = RangeType.Format;
month.Format = "MMM";
month.Caption = "Month";
IBindingList data = fp.FlexPivotDefaultView;
Other operations
Apart from transforming data, C1DataEngine can perform other operations also like sorting and filtering. We will take a brief look at how these operations can be performed.
Sorting
C1DataEngine package exposes a DataList class that has a static method know as Sort. This method takes an IDataList, column name, and an optional ascending parameter and sorts the list according to the column name and ascending. Let's apply a descending sort on the Sales column in the SalesByCountry query.
var result = this._workSpace.GetQueryData("SalesByCountry");
DataList.Sort(result, "Sales", false); // sort result in descending order by sales column
Notice how I have used the GetQueryData method directly without creating the query object. This is because once the SalesByCountry query runs, the data engine stores the results in the workspace folder so that it can be used again even after restarting the application without querying the database again.
Filtering
Filtering can be performed by using the C1FlexPivotFitler class which is exposed by the Filter property of the C1FlexPivotField object. This class contains many properties to combine different filter types. For example, we can filter the ProductSalesByCountry to only display Brazil country.
var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { }, new string[] { "Country", "ProductName" }, new string[] { "Sales" });
var fld = fp.RowFields[0].Filter;
fld.Condition1.Operator = ConditionOperator.Equals;
fld.Condition1.Parameter = "Brazil";
IBindingList data = fp.FlexPivotDefaultView;
We can also use the NotEqualsTo operator instead of Equals operator in the above query to filter out Brazil and display all other countries:
var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { }, new string[] { "Country", "ProductName" }, new string[] { "Sales" });
var fld = fp.RowFields[0].Filter;
fld.Condition1.Operator = ConditionOperator.NotEquals;
fld.Condition1.Parameter = "Brazil";
IBindingList data = fp.FlexPivotDefaultView;
We can also combine different operators to perform complex filtering. For example, if we only wish to aggregate those items where Sales is less than 200 and greater than 100, then we can use a combination of GreaterThanOrEqualTo and LessThanOrEqualTo operators along with AndConditions property:
var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { }, new string[] { "Country" }, new string[] { "Sales" });
var fld = fp.ValueFields[0].Filter;
fld.Condition1.Operator = ConditionOperator.GreaterThanOrEqualTo;
fld.Condition1.Parameter = 100;
fld.Condition2.Operator = ConditionOperator.LessThanOrEqualTo;
fld.Condition2.Parameter = 200;
fld.AndConditions = true;
IBindingList data = fp.FlexPivotDefaultView;
This was a brief article on how we can use C1DataEngine and C1FlexPivotEngine for pivot transform. But this is just scratching the surface. More detailed explanation about transforms, sorting, etc., are provided in the documentation link. If you wish to read more about working on C1DataEngine, you can refer to this blog link. This blog provides more detailed information about each component of C1DataEngine as well as how you can use the command-line tool of C1DataEngine.
You can download the sample used in the blog from here.