About this chapter
This chapter describes how to build crosstabs.
Cross tabulation is a useful technique for analyzing data. By presenting data in a spreadsheet-like grid, a crosstab lets users view summary data instead of a long series of rows and columns. For example, in a sales application you might want to summarize the quarterly unit sales of each product.
In PowerBuilder, you create crosstabs by using the Crosstab presentation style. When data is retrieved into the DataWindow object, the crosstab processes all the data and presents the summary information you have defined for it.
An example
Crosstabs are easiest to understand through an example. Consider the Printer table in the PB Demo DB. It records quarterly unit sales of printers made by sales representatives in one year. (This is the same data used to illustrate graphs in Working with Graphs)
Rep |
Quarter |
Product |
Units |
---|---|---|---|
Simpson |
Q1 |
Stellar |
12 |
Jones |
Q1 |
Stellar |
18 |
Perez |
Q1 |
Stellar |
15 |
Simpson |
Q1 |
Cosmic |
33 |
Jones |
Q1 |
Cosmic |
5 |
Perez |
Q1 |
Cosmic |
26 |
Simpson |
Q1 |
Galactic |
6 |
Jones |
Q1 |
Galactic |
2 |
Perez |
Q1 |
Galactic |
1 |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
Simpson |
Q4 |
Stellar |
30 |
Jones |
Q4 |
Stellar |
24 |
Perez |
Q4 |
Stellar |
36 |
Simpson |
Q4 |
Cosmic |
60 |
Jones |
Q4 |
Cosmic |
52 |
Perez |
Q4 |
Cosmic |
48 |
Simpson |
Q4 |
Galactic |
3 |
Jones |
Q4 |
Galactic |
3 |
Perez |
Q4 |
Galactic |
6 |
This information can be summarized in a crosstab. Here is a crosstab that shows unit sales by printer for each quarter:
The first-quarter sales of Cosmic printers displays in the first data cell. (As you can see from the data in the Printer table shown before the crosstab, in Q1 Simpson sold 33 units, Jones sold 5 units, and Perez sold 26 units—totaling 64 units.) PowerBuilder calculates each of the other data cells the same way.
To create this crosstab, you only have to tell PowerBuilder which database columns contain the raw data for the crosstab, and PowerBuilder does all the data summarization automatically.
What crosstabs do
Crosstabs perform two-dimensional analysis:
-
The first dimension is displayed as columns across the crosstab.
In the preceding crosstab, the first dimension is the quarter, whose values are in the Quarter column in the database table.
-
The second dimension is displayed as rows down the crosstab.
In the preceding crosstab, the second dimension is the type of printer, whose values are in the Product column in the database table.
Each cell in a crosstab is the intersection of a column (the first dimension) and a row (the second dimension). The numbers that appear in the cells are calculations based on both dimensions. In the preceding crosstab, it is the sum of unit sales for the quarter in the corresponding column and printer in the corresponding row.
Crosstabs also include summary statistics. The preceding crosstab totals the sales for each quarter in the last row and the total sales for each printer in the last column.
How crosstabs are implemented in PowerBuilder
Crosstabs in PowerBuilder are implemented as grid DataWindow objects. Because crosstabs are grid DataWindow objects, users can resize and reorder columns at runtime (if you let them).
Import methods return empty result
A crosstab report takes the original result set that was retrieved from the database, sorts it, summarizes it, and generates a new summary result set to fit the design of the crosstab. The ImportFile, ImportClipboard, and ImportString methods can handle only the original result set, and they return an empty result when used with a crosstab report.
There are two types of crosstabs:
-
Dynamic
-
Static
Dynamic crosstabs
With dynamic crosstabs, PowerBuilder builds all the columns and rows in the crosstab dynamically when you run the crosstab. The number of columns and rows in the crosstab match the data that exists at runtime.
Using the preceding crosstab as an example, if a new printer was added to the database after the crosstab was saved, there would be an additional row in the crosstab when it is run. Similarly, if one of the quarter's results was deleted from the database after the crosstab was saved, there would be one less column in the crosstab when it is run.
By default, crosstabs you build are dynamic.
Static crosstabs
Static crosstabs are quite different from dynamic crosstabs.With static crosstabs, PowerBuilder establishes the columns in the crosstab based on the data in the database when you define the crosstab. (It does this by retrieving data from the database when you initially define the crosstab.) No matter what values are in the database when you later run the crosstab, the crosstab always has the same columns as when you defined it.
Using the preceding crosstab as an example, if there were four quarters in the database when you defined and saved the crosstab, there would always be four columns (Q1, Q2, Q3, and Q4) in the crosstab at runtime, even if the number of columns changed in the database.
Advantages of dynamic crosstabs
Dynamic crosstabs are used more often than static crosstabs, for the following reasons:
-
You can define dynamic crosstabs very quickly because no database access is required at definition time.
-
Dynamic crosstabs always use the current data to build the columns and rows in the crosstab. Static crosstabs show a snapshot of columns as they were when the crosstab was defined.
-
Dynamic crosstabs are easy to modify: all properties for the dynamically built columns are replicated at runtime automatically. With static crosstabs, you must work with one column at a time.
To create a crosstab:
-
Select File>New from the menu bar.
The New dialog box displays.
-
Select the DataWindow tab.
-
Select the Crosstab presentation style, then click OK.
-
On the Choose Data Source for Crosstab DataWindow page, specify the data you want retrieved into the DataWindow object.
For more information, see Defining DataWindow Objects.
-
In the Define Crosstab Rows, Columns, Values page, enter the definitions for the columns, rows, and cell values in the crosstab.
-
Click Next.
-
Choose Color and Border settings and click Next.
-
Review your specifications and click Finish.
PowerBuilder creates the crosstab.
-
(Optional) Specify other properties of the crosstab.
See Enhancing crosstabs.
-
Save the DataWindow object in a library.
You associate crosstab columns, rows, and cell values with columns in a database table or other data source.
To associate data with a crosstab:
If you are defining a new crosstab, the Define Crosstab Rows, Columns, Values dialog box displays after you specify the data source.
-
Specify the database columns that will populate the columns, rows, and values in the crosstab, as described below.
-
To build a dynamic crosstab, make sure the Rebuild columns at runtime check box is selected.
For information about static crosstabs, see Creating static crosstabs.
-
Click Next.
To define the crosstab, drag the column names from the Source Data box in the Crosstab Definition dialog box (or Wizard page) into the Columns, Rows, or Values box, as appropriate.
If you change your mind or want to edit the DataWindow object later, select Design>Crosstab from the menu bar and drag the column name out of the Columns, Row, or Values box and drop it. Then specify a different column.
The process is illustrated using the following dynamic crosstab. The columns in the database are Rep, Quarter, Product, and Units. The crosstab shows the number of printers sold by Quarter:
Specifying the columns
You use the Columns box to specify one or more of the retrieved columns to provide the columns in the crosstab. When users run the crosstab, there is one column in the crosstab for each unique value of the database column(s) you specify here.
To specify the crosstab's columns
-
Drag the database column from the Source Data box into the Columns box.
-
Using the printer example, to create a crosstab where the quarters form the columns, specify Quarter as the Columns value. Because there are four values in the table for Quarter (Q1, Q2, Q3, and Q4), there are four columns in the crosstab.
Specifying the rows
You use the Rows box to specify one or more of the retrieved columns to provide the rows in the crosstab. When users run the crosstab, there is one row in the crosstab for each unique value of the database column(s) you specify here.
To specify the crosstab's rows
-
Drag the database column from the Source Data box into the Rows box.
Using the printer example, to create a crosstab where the printers form the rows, specify Product as the Rows value. Because there are three products (Cosmic, Galactic, and Stellar), at runtime there are three rows in the crosstab.
Columns that use code tables
If you specify columns in the database that use code tables, where data is stored with a data value but displayed with more meaningful display values, the crosstab uses the column's display values, not the data values. For more information about code tables, see Displaying and Validating Data.
Specifying the values
Each cell in a crosstab holds a value. You specify that value in the Values box. Typically you specify an aggregate function, such as Sum or Avg, to summarize the data. At runtime, each cell has a calculated value based on the function you provide here and the column and row values for the particular cell.
To specify the crosstab's values
-
Drag the database column from the Source Data box into the Values box.
PowerBuilder displays an aggregate function for the value. If the column is numeric, PowerBuilder uses Sum. If the column is not numeric, PowerBuilder uses Count.
If you want to use an aggregate function other than the one suggested by PowerBuilder, double-click the item in the Values box and edit the expression. You can use any of the other aggregate functions supported in the DataWindow painter, such as Max, Min, and Avg.
Using the printer example, you would drag the Units column into the Values box and accept the expression sum(units for crosstab).
Instead of simply specifying database columns, you can use any valid DataWindow expression to define the columns, rows, and values used in the crosstab. You can use any non-object-level DataWindow expression function in the expression.
For example, say a table contains a date column named SaleDate, and you want a column in the crosstab for each month. You could enter the following expression for the Columns definition:
Month(SaleDate)
The Month function returns the integer value (1–12) for the specified month. Using this expression, you get columns labeled 1 through 12 in the crosstab. Each database row for January sales is evaluated in the column under 1, each database row for February sales is evaluated in the column under 2, and so on.
After you have specified the data for the crosstab's columns, rows, and values, PowerBuilder displays the crosstab definition in the Design view.
For example, to create the dynamic crosstab shown as the Dynamic crosstab example, you would:
-
Drag the quarter column from the Source Data box to the Columns box.
-
Drag the product column from the Source Data box to the Rows box.
-
Drag the units column from the Source Data box to the Values box and accept the expression sum(units for crosstab).
-
Select the Rebuild columns at runtime check box.
In the Design view, the crosstab looks like this:
Notice that in the Design view, PowerBuilder shows the quarter entries using the symbolic notation @quarter (with dynamic crosstabs, the actual data values are not known at definition time). @quarter is resolved into the actual data values (in this case, Q1, Q2, Q3, and Q4) when the crosstab runs.
The crosstab is generated with summary statistics: the rows and columns are totaled for you.
At this point, the crosstab looks like this in the Preview view with data retrieved:
Because quarter was selected as the Columns definition, there is one column in the crosstab for each unique quarter (Q1, Q2, Q3, and Q4).
Because product was selected as the Rows definition, there is one row in the crosstab for each unique product (Cosmic, Galactic, and Stellar).
Because sum(units for crosstab) was selected as the Values definition, each cell contains the total unit sales for the corresponding quarter (the Columns definition) and product (the Rows definition).
PowerBuilder displays the grand totals for each column and row in the crosstab.
Typically you specify one database column as the Columns definition and one database column for the Rows definition, as in the printer crosstab. But you can specify as many columns (or expressions) as you want.
For example, consider a crosstab that has the same specification as the crosstab in Viewing the crosstab, except that two database columns, quarter and rep, have been dragged to the Columns box.
PowerBuilder displays this in the Design view:
This is what you see at runtime:
For each quarter, the crosstab shows sales of each printer by each sales representative.
When you have defined the crosstab, you can see it with data in the Preview view.
To preview the crosstab:
-
If the Preview view is not open, select View>Preview from the menu bar to display the Preview view.
-
Click on the Preview view to be sure it is current.
-
Select Rows>Retrieve from the menu bar.
PowerBuilder retrieves the rows and performs the cross tabulation on the data.
Retrieve on Preview makes retrieval happen automatically
If the crosstab definition specifies Retrieve on Preview, retrieval happens automatically when the Preview view first displays.
-
Continue enhancing your DataWindow object and retrieve again when necessary to see the results of your enhancements.
When you have provided the data definitions, the crosstab is functional, but you can enhance it before using it. Because a crosstab is a grid DataWindow object, you can enhance a crosstab using the same techniques you use in other DataWindow objects. For example, you can:
-
Sort or filter rows
-
Change the column headers
-
Specify fonts, colors, mouse pointers, and borders
-
Specify column display formats
For more on these and the other standard enhancements you can make to DataWindow objects, see Enhancing DataWindow Objects.
The rest of this section covers topics either unique to crosstabs or especially important when working with crosstabs:
Crosstabs are implemented as grid DataWindow objects, so you can specify the following grid properties for a crosstab:
-
When grid lines are displayed
-
How users can interact with the crosstab at runtime
To specify the crosstab's basic properties
-
In the Properties view, select the General tab.
-
Specify basic crosstab properties.
The following table lists basic crosstab properties.
Option
Result
Display
On – Grid lines always display.
Off – Grid lines never display (columns cannot be resized at runtime).
Display Only – Grid lines display only when the crosstab displays online.
Print Only – Grid lines display only when the contents of the crosstab are printed.
Column Moving
Columns can be moved at runtime.
Mouse Selection
Data can be selected at runtime (and, for example, copied to the clipboard).
Row Resize
Rows can be resized at runtime.
When you initially define the crosstab, you associate the crosstab rows and columns with columns in a database table or other data source. You can change the associated data at any time in the Crosstab Definition dialog box.
To open the Crosstab Definition dialog box
-
Position the mouse below the footer band in the workspace and display the pop-up menu.
-
Select Crosstab from the pop-up menu.
The Crosstab Definition dialog box displays.
To modify the data associated with a crosstab
-
In the Crosstab Definition dialog box, fill in the boxes for Columns, Rows, and Values as described in Associating data with a crosstab.
-
Click OK.
Sometimes names of columns in the database might not be meaningful. You can change the names that are used to label rows and columns in crosstabs so that the data is easier to understand.
To change the names used in crosstabs
-
In the Crosstab Definition dialog box, double-click the name of the column in the Source Data box.
The New Name dialog box displays.
-
Specify the name you want used to label the corresponding column. You can have multiple-word labels by using underscores: underscores are replaced by spaces in the Design view and at runtime.
-
Click OK.
PowerBuilder changes the column name in the Source Data box and anywhere else the column is used.
Example
For example, if you want the product column to be labeled Printer Model, double-click product in the Crosstab Definition dialog box and specify printer_model in the New Name dialog box.
When the crosstab runs, you see this:
When you generate a crosstab, the columns and rows are automatically totaled for you. You can include other statistical summaries in crosstabs as well. To do that, you place computed fields in the workspace.
To define a column summary
-
Enlarge the summary band to make room for the summaries.
-
Select Insert>Control > Computed Field from the menu bar.
-
Click the cell in the summary band where you want the summary to display.
The Modify Expression dialog box displays.
-
Define the computed field.
For example, if you want the average value for a column, specify avg(units for all), where units is the column providing the values in the crosstab.
For example, this is a crosstab that has been enhanced to show averages and maximum values for each column. This is the Design view:
This is the crosstab at runtime:
To define a row summary
-
Select Insert>Control > Computed Field from the menu bar.
-
Click the empty cell to the right of the last column in the detail band.
The Modify Expression dialog box displays.
-
Define the computed field. You should use one of the crosstab functions, described next.
There are nine special functions you can use only in crosstabs: CrosstabAvg, CrosstabAvgDec, CrosstabCount, CrosstabMax, CrosstabMaxDec, CrosstabMin, CrosstabMinDec, CrosstabSum, and CrosstabSumDec.
These functions are listed in the Functions box when you define a computed field in a crosstab:
Each of these functions returns the corresponding statistic about a row in the crosstab (average, count, maximum value, minimum value, or sum). You place computed fields using these functions in the detail band in the Design view. Use the functions with the Dec suffix when you want to return a decimal datatype.
By default, PowerBuilder places CrosstabSum and CrosstabSumDec in the detail band, which returns the total for the corresponding row.
How to specify the functions
Each of these functions takes one numeric argument, which refers to the expression defined for Values in the Crosstab Definition dialog box. The first expression for Values is numbered 1, the second is numbered 2, and so on.
Generally, crosstabs have only one expression for Values, so the argument for the crosstab functions is 1. So, for example, if you defined sum(units for crosstab) as your Values expression, PowerBuilder places CrosstabSum in the detail band.
If you want to cross-tabulate both total unit sales and a projection of future sales, assuming a 20 percent increase in sales (that is, sales that are 1.2 times the actual sales), you define two expressions for Values:
sum(units for crosstab) sum(units * 1.2 for crosstab)
Here CrosstabSum returns the total of sum(units for crosstab) for the corresponding row. CrosstabSum returns the total for sum(units * 1.2 for crosstab).
For more information
For complete information about defining computed fields, see Enhancing DataWindow Objects.
For more about the crosstab functions, see the section called “Alphabetical list of DataWindow expression functions” in DataWindow Reference.
You can build a crosstab where each row tabulates a range of values, instead of one discrete value, and you can make each column in the crosstab correspond to a range of values.
For example, in cross-tabulating departmental salary information, you might want one row in the crosstab to count all employees making between $30,000 and $40,000, the next row to count all employees making between $40,000 and $50,000, and so on.
To cross-tabulate ranges of values
-
Determine the expression that results in the raw values being converted into one of a small set of fixed values.
Each of those values will form a row or column in the crosstab.
-
Specify the expression in the Columns or Rows box in the Crosstab Definition dialog box.
You choose the box depending on whether you want the columns or rows to correspond to the range of values.
-
In the Values column, apply the appropriate aggregate function to the expression.
Example
This is best illustrated with an example.
You want to know how many employees in each department earn between $30,000 and $40,000, how many earn between $40,000 and $50,000, how many earn between $50,000 and $60,000, and so on. To do this, you want a crosstab where each row corresponds to a $10,000 range of salary.
The first step is to determine the expression that, given a salary, returns the next smaller salary that is a multiple of $10,000. For example, given a salary of $34,000, the expression would return $30,000, and given a salary of $47,000, the expression would return $40,000. You can use the Int function to accomplish this, as follows:
int(salary/10000) * 10000
That expression divides the salary by 10,000 and takes the integer portion, then multiplies the result by 10,000. So for $34,000, the expression returns $30,000, as follows:
34000/10000 = 3.4 int(3.4) = 3 3 * 10000 = 30000
With this information you can build the crosstab. The following uses the Employee table in the PB Demo DB:
-
Build a crosstab and retrieve the dept_id and salary columns.
-
In the Crosstab Definition dialog box, drag the dept_id column to the Columns box.
-
Drag the salary column to the Rows box and to the Values box and edit the expressions.
-
In the Rows box, use:
int(salary/10000) * 10000
-
In the Values box, use:
count(int(salary/10000) * 10000 for crosstab)
For more on providing expressions in a crosstab, see Using expressions.
-
Click OK.
This is the result in the Design view:
This is the crosstab at runtime:
You can see, for example, that 2 people in department 400 and 5 in department 500 earn between $20,000 and $30,000.
Displaying blank values as zero
In the preceding crosstab, several of the cells in the grid are blank. There are no employees in some salary ranges, so the value of those cells is null. To make the crosstab easier to read, you can add a display format to fields that can have null values so that they display a zero.
By default, crosstabs are dynamic: when you run them, PowerBuilder retrieves the data and dynamically builds the columns and rows based on the retrieved data. For example, if you define a crosstab that computes sales of printers and a new printer type is entered in the database after you define the crosstab, you want the new printer to be in the crosstab. That is, you want PowerBuilder to build the rows and columns dynamically based on current data, not the data that existed when the crosstab was defined.
Occasionally, however, you might want a crosstab to be static. That is, you want its columns to be established when you define the crosstab. You do not want additional columns to display in the crosstab at runtime; no matter what the data looks like, you do not want the number of columns to change. You want only the updated statistics for the predefined columns. The following procedure shows how to do that.
To create a static crosstab
-
In the wizard page or in the Crosstab Definition dialog box, clear the Rebuild columns at runtime check box.
-
Define the data for the crosstab as usual, and click OK.
What happens
With the check box cleared, instead of immediately building the crosstab's structure, PowerBuilder first retrieves the data from the database. Using the retrieved data, PowerBuilder then builds the crosstab structure and displays the workspace. It places all the values for the column specified in the Columns box in the workspace. These values become part of the crosstab's definition.
For example, in the following screenshot, the four values for Quarter (Q1, Q2, Q3, and Q4) are displayed in the Design view:
At runtime, no matter what values are in the database for the column, the crosstab shows only the values that were specified when the crosstab was defined. In the printer example, the crosstab always has the four columns it had when it was first defined.
Making changes
You can modify the properties of any of the columns in a static crosstab. You can modify the properties of each column individually, since each column is displayed in the workspace as part of the crosstab's definition. For example, in the printer crosstab you can directly modify the way values are presented in each individual quarter, since each quarter is represented in the Design view. (The values are shown as units, units_1, units_2, and units_3.)
As with other DataWindow objects, you can specify property conditional expressions to modify properties at runtime. You can use them with either dynamic or static crosstabs. With dynamic crosstabs, you specify an expression once for a column or value, and PowerBuilder assigns the appropriate properties when it builds the individual columns at runtime. With static crosstabs, you have to specify an expression for each individual column or value, because the columns are already specified at definition time.
Example
In the following crosstab, an expression has been specified for Units:
The expression is for the Font.Weight property of the units column:
if (units > 100, 700, 400)
The expression specifies to use bold font (weight = 700) if the number of units is greater than 100. Otherwise, use normal font (weight = 400).
This is the crosstab at runtime:
Values larger than 100 are shown in bold.
For more information about property conditional expressions, see Highlighting Information in DataWindow Objects.