How can I find running totals in the database? An exploration of data normalization.
I’m always asked, “how do I query to get the total PO value?” or “where’s the total line cost I see on the Order Tracker in the database?”. These are very valid questions, and the answer is not immediately intuitive if you have not worked with databases in the past. As you already know, it’s a little tricky in Epicor. In most cases, total values are not stored anywhere in the database, and this is a good thing. Knowing that, querying values becomes an exercise in calculation rather than a treasure hunt.
Why is it a good thing that totals are not stored in the database? It is much more efficient (and accurate!) for the database to calculate the total values based on the related records when the total value is needed. This ensures that the total is always based on the most up-to-date data, and it eliminates the need to constantly update two independent fields. For instance, if the POHeader stored the running total of all the POLines every line change would then have the overhead of also updating the field on the POHeader. It doesn’t sound like much, but over thousands of transactions with finite resources there is a bottleneck. Two moving parts are more likely to break than one. This comes back to what is referred to as data normalization, which is the process of organizing data in a database. There are a few different guidelines for normalization, known as “forms”, with the first three being the most common (there are five in total plus some special off-shoots). Edgar Codd, who is the inventor of the relational model, has the following goals listed for first normal form:
- 1. To free the collection of relations from undesirable insertion, update and deletion dependencies;
- 2. To reduce the need for restructuring the collection of relations, as new types of data are introduced, and thus increase the life span of application programs;
- 3. To make the relational model more informative to users;
- 4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.
E.F. Codd, “Further Normalization of the Data Base Relational Model” p. 34
Modern databases are normally structured by the guidelines in the first three normal forms. Functionally, this means that our database is built by a series of unique table definitions which are joined by relationships. This gives us the ability to quickly query the data (which is good!), but means that we need to also do our own calculations and summaries for total fields and any other aggregated data (which is good because of the first point, but can be frustrating… it’s a very necessary evil for the integrity of the database). For a complete exploration into normal forms I’d recommend getting started with Microsoft’s KB Article 283878. Fear not, we have some tools in Epicor to tease out the data we want
The first place I always look is the field help. For many of the fields we can get the calculation here, so that removes a lot of the guess work. For instance if we look at the extended price on the PO line the field help shows us that it’s the supplier quantity times the unit cost divided by the price-per conversion. If you look at the technical details you’ll see that this field only has an EpiBinding, not a DB Field. That’s another good indicator that the field is calculated on the fly, the EpiBinding is the EpiDataView binding in the user-interface, it is abstracted from the data table and cannot be queried directly. Having the description of the calculated value is great, but it only gives us part of the equation. The second half comes in with how we want to go about summarizing the data. There are three methods we can use:
Method 1 – a summary table in a standard BAQ
Method 2 – an executive query for dimensional analysis
Method 3 – reporting (either Crystal Reports or SSRS)
The summary table method is nice if you need to summarize a small data set and it is very easy to do. To build a summary table in a BAQ (Epicor 9.05) you simply right-click on the table representation in the Phrase Builder and choose “Toggle IsSummary Flag”. The table representation will change to a darker green and a sigma symbol will display in the lower right. This means you cannot display the standard fields; however, you can create calculated fields using the different aggregate functions (average, count, maximum, minimum, total). For instance, if I wanted the total extended price across all lines I’d create a calculated field with the following equation (assuming all my cost-per’s are “E”): total(podetail.orderqty * podetail.unitcost) . Now in my BAQ results if I’ve linked POHeader with the summed PODetail I can show POHeader.PONum and then my calculated field for the summed extended cost across all lines.
Using executive queries gives you a much greater range of dimensional analysis. The executive process creates records in the mfgCube tables which represent the running totals across dimension pairs from a sourcing BAQ. Therefore, we could make a regular BAQ that had the PODetail values exposed (not aggregated) and make a calculated field for our extended price. Then in the executive query we’d define our dimension pairing as needed, for instance we could do by PONumber alone, or something more creative like Vendor versus Week to see the total amounts per Vendor through time. One tip I’d recommend is to add a calculated field to your source BAQ for “count” and set it equal to one. That way each record gets a weight of one in the summary. To use this, simply set an integer for you dimension pair field mapping to the “count” calculated field – you’ll have a running count of all records that went into that dimension pair.
Both methods are great for different things, but at the core they are letting you perform running total calculations either by table (summarized BAQ) or dimension pair (executive query). If you are working with Salesforce the summarized BAQ is exactly analogous to a roll-up summary field in a master-detail relationship. A final method is with SSRS or Crystal reports. Both reporting options let you create running totals on fields provided to the respective data sources. With Crystal the connection type is XML ADO.NET by default, this means you cannot execute SQL commands in the report. To cleanly display running total values with this limitation I make heavy use of sub-reports to suppress the details and only show my final calculated fields in the associated group footer.
I hope this gives you some insight into why the normally visible running total fields cannot be directly queried, and a few techniques for teasing out the data. This is not to say there are not total fields in the database you can use, some instances like AR Invoicing keep the current balance and amount on the header record, but you still need to integrate other areas like tax and miscellaneous charges.