Blog

Pivoting Rows and Columns in .NET DataTableObject

21.04.2016
Aljoša Kocen

Say we were given a flat table of customer odrers. It may look like this:

Now say that we would like to know what is the cumulative value of orders placed by each customer ID in the range of dates. In MS Excel one would use QueryTable feature, to produce the following output:

This is called CrossTab report. It's main usage is when we need to present data in a report or show it in a grid on the screen.

We can see that not all customers placed orders every day. The report also gives us the sums for each day, and for each customer.

Building Cross tab reports in .NET

Having data in Excel, or txt or CSV format is not always the case. I came upon the situation where the data source was a legacy SQL Server Databaase. I could have used T-SQL PIVOT option.  I wanted a NET solution. The one I chose involves manipulating .NET DataTable rows and columns, to achieve desired output. I decided to implement the functionality as an Extension method, by extending existing DataTable object. I called the method Pivot, so instead of writing:

ResultsTable = Pivot(SourceTable)

We write:

ResultsTable=SourceTable.Pivot()

Pivot method will be supplied with four Arguments:

  • KeyColumnName: which presents the Key values of Pivot DataTable. In the case above, Date, is the KeyColumnname.
  • PivotColumnName: It is the column in the Source table, which presents the values to be pivoted into columns. In the example code above, CustomerID is the PivotColumnName.
  • ValueColumnName:  It presents the values to be manipulated accross the report. OrderValue in the example above is ValueColumnName in our sample above.
  • PivotAction:  Tells the Pivot method what type of manipulation is required by the method. The possible values are:
    • Sum: Find Total sum od ValueColumn based on KeyColumn and PivotColumn
    • Min: Find the minimum ValueColumn value based on KeyColumn and PivotColumn
    • Max: Find the maximum ValueColumn value  based on KeyColumn and PivotColumn
    • Count: Tells how many times has a certain combination of KeyColumn and PivotColumn occured in the SourceTable

The Implementation:

First we define the Pivot Action Enumeration:

	Public Enum PivotAction
		Sum = 1     'Sum up the data
        Min = 2     'Find Minimum Value
        Max = 3     'Find Maximum Value
        Count = 4   'Count Number of Occurences
    End Enum

And then we Define an Extension method itself:

	<Extension()>
	Public Function Pivot(ByVal SourceTable As DataTable,
								 ByVal KeyColumnName As String,
								 ByVal PivotColumnName As String,
								 ByVal ValueColumnName As String,
								 ByVal PivotAction As PivotAction) As DataTable

		'Results DataTable
		Dim ResultsData As New DataTable

		Dim Dv As DataView = SourceTable.DefaultView
		'Sort Source table by PivotColumnname
		Dv.Sort = PivotColumnName
		Dim keys(0) As DataColumn

		'KeyColumnName Values in ResultsData table must be unique
		Dim PrimaryColType As Type = SourceTable.Columns(KeyColumnName).DataType

		'Add KeyColumn in ResultData table
		ResultsData.Columns.Add(New DataColumn(KeyColumnName, PrimaryColType))

		'Loop through the rows in SourceTable and check PivotColumnName
		For i As Integer = 0 To Dv.Count - 1
			'If Column does not exist in the ResultsDataTable, add it
			If (ResultsData.Columns.Contains(Dv.Item(i).Item(PivotColumnName)) = False) Then
				ResultsData.Columns.Add(New DataColumn(Dv.Item(i).Item(PivotColumnName).ToString,
													   SourceTable.Columns(ValueColumnName).DataType))
			End If
		Next

		'Sort the SourceTable by KeyColumn
		Dv.Sort = KeyColumnName

		'Set PrimaryKey on The KeyColumn in ResultsTable 
		keys(0) = ResultsData.Columns(0)
		ResultsData.PrimaryKey = keys
		SourceTable = Dv.ToTable
		Dim dr As DataRow

		For i = 0 To SourceTable.Rows.Count - 1
			If (ResultsData.Rows.Find(SourceTable.Rows(i).Item(KeyColumnName)) Is Nothing) Then
                'If the DataRow with specified Column Does Not exist, we must add it
                dr = ResultsData.NewRow
				dr.Item(KeyColumnName) = SourceTable.Rows(i).Item(KeyColumnName)
				dr.Item(SourceTable.Rows(i).Item(PivotColumnName)) = SourceTable.Rows(i).Item(ValueColumnName)
				ResultsData.Rows.Add(dr)
			Else
                'We retrieve the DataRow in question and update/insert pivoted values
                dr = ResultsData.Rows.Find(SourceTable.Rows(i).Item(KeyColumnName))
				If dr.Item(SourceTable.Rows(i).Item(PivotColumnName)) Is DBNull.Value Then
                    'This is the first time we write to a cell
                    Select Case PivotAction
						'If PivotAction.Count, set value to 1
						Case PivotAction.Count
							dr.Item(SourceTable.Rows(i).Item(PivotColumnName)) = 1
						Case Else
							'Set the value to the value in the SourceTable cell
							dr.Item(SourceTable.Rows(i).Item(PivotColumnName)) =
								SourceTable.Rows(i).Item(ValueColumnName)

					End Select
				Else
                    'This is not the first time we write to a cell, so we update the values
                    'according to the PivotAction specified
                    Select Case PivotAction
						Case PivotAction.Count
							'If PivotAction.Count, increment by 1
							dr.Item(SourceTable.Rows(i).Item(PivotColumnName)) =
								dr.Item(SourceTable.Rows(i).Item(PivotColumnName)) + 1
						Case PivotAction.Max
							'If PivotAction.max, set value to Maximum of current value and the value in SourceTable
							If (SourceTable.Rows(i).Item(ValueColumnName) >
								 dr.Item(SourceTable.Rows(i).Item(PivotColumnName))) Then
								dr.Item(SourceTable.Rows(i).Item(PivotColumnName)) =
									SourceTable.Rows(i).Item(ValueColumnName)
							End If
						Case PivotAction.Min
							'If PivotAction.Min, set value to Minimum of current value and the value in SourceTable
							If (SourceTable.Rows(i).Item(ValueColumnName) <
								 dr.Item(SourceTable.Rows(i).Item(PivotColumnName))) Then
								dr.Item(SourceTable.Rows(i).Item(PivotColumnName)) =
									SourceTable.Rows(i).Item(ValueColumnName)
							End If
						Case PivotAction.Sum
							'If PivotAction.Sum, Add the value in the SourceDataColumn to the value in the ResultsTable
							dr.Item(SourceTable.Rows(i).Item(PivotColumnName)) +=
								SourceTable.Rows(i).Item(ValueColumnName)
					End Select
				End If
			End If
		Next

		'Clear DataView
		Dv = Nothing

		'Return Result
		Return ResultsData
	End Function

I documented the code heavily. What the method does is:

  1. Creates resulting Datatable.
  2. Assings SourceTable's KeyColumn as its KeyKolumn
  3. Loops through the SourceTable and adds Columns to ResultsData, according to the values found in the SourceTable PivotColumn
  4. Sorts the SourceColumn according to KeyColumName
  5. Loops through the SourceData and inserts/updates values in the ResultsData table

Testing the Code

I've created a sample app, which performs the generation of Dummy data and the pivot table creation. It will also save both, the source and results table into Excel Workbook, so one can always check the results by using Excel QueryTable Feature. You can download the full source code here. Do give it a try, see through it, and I, as always do wellcome all the suggestions, requests and positive critique through the contact form on this site.

Happy coding.