The source is a single table of pivoted data, which will be referred to as ICOD_Main. This structure allows all data, now and in the future, to be loaded. ICOD_Main’s Data Structure (a DDL sketch follows the list):
- Unit_Id: This is a varchar that is unique to the object, usually a business key for flexibility and long-term support
- Examples would be XFMR1234 or CBKR231A
- Normalized_Data_Point: This is a varchar that is the end-mapped name of the point
- Data_Point_Source_System: This is the source system that the data point came from
- Data_Point_Source_Table: This is the source system’s table that the data point came from
- Data_Point_Source_Point: This is the column, in the source system’s table, that the data came from
- Data_Point_Datetime: This is a full datetime column that has the approved date and time of the record
- Data_Value: This is a varchar(4000) that stores the data point itself. It holds all data types (except blobs, XML, etc.), and the data is converted before and after storage based on the Normalized_Data_Point’s assumed data type
- Detail_Notes: This is also a varchar(4000) and is an overflow for data, because there are times when you need a little more contextual information
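To make the structure concrete, here’s a minimal DDL sketch, assuming a SQL Server dialect. The key lengths, nullability, and lack of indexing are assumptions for illustration, not the production definition:

```sql
-- Minimal sketch of ICOD_Main; types beyond the two varchar(4000) columns
-- are assumed for illustration.
CREATE TABLE ICOD_Main (
    Unit_Id                  varchar(50)   NOT NULL, -- business key, e.g. XFMR1234
    Normalized_Data_Point    varchar(100)  NOT NULL, -- end-mapped point name
    Data_Point_Source_System varchar(100)  NULL,     -- originating system
    Data_Point_Source_Table  varchar(100)  NULL,     -- originating table
    Data_Point_Source_Point  varchar(100)  NULL,     -- originating column
    Data_Point_Datetime      datetime      NOT NULL, -- approved date and time of the record
    Data_Value               varchar(4000) NULL,     -- value stored as text, converted on read/write
    Detail_Notes             varchar(4000) NULL      -- overflow for extra context
);
```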
There are a number of identified data blocks stored inside ICOD_Main:
- Test results: this is a single result
- Although this seems basic, this is the meat of the ICOD. This allows for quick cube analysis across all units, time, or sets of points
- Top-level, single result: this is useful for reference information
- This is where nameplate or current installation-status information goes, with the idea that this data is not a slowly changing dimension. This is a 1:1 dataset.
- Event flags: Useful as an indicator that something happened at a point in history.
- An example is converting a qualitative event into a quantitative point; the sample rows after this list illustrate all three block types.
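Here are hypothetical rows for each block type; the point names and values are invented for illustration, not real mappings:

```sql
-- Hypothetical rows showing the three block types; all names and values are invented.
INSERT INTO ICOD_Main (Unit_Id, Normalized_Data_Point, Data_Point_Datetime, Data_Value, Detail_Notes)
VALUES
    -- Test result: a single measured value at a point in time
    ('XFMR1234', 'Oil_Dissolved_H2_PPM', '2023-06-01 08:00:00', '42', NULL),
    -- Top-level, single result: nameplate/reference data, 1:1 with the unit
    ('XFMR1234', 'Nameplate_MVA',        '1998-01-01 00:00:00', '50', 'From original install records'),
    -- Event flag: a qualitative event converted into a quantitative point
    ('XFMR1234', 'Through_Fault_Event',  '2023-05-15 14:32:00', '1',  'Fault recorded by protection relay');
```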
When loading data into ICOD_Main, there are a number of processing steps:
- Load what you know (automated); a sketch of this step follows the list
- Go get test data. It’s simple and easy.
- Load more complex data points; these may require remapping or recalculation from the test data.
- Make a ledger
- Store data that comes in from a pipeline that isn’t automatically ELT’d from a source.
- Preprocess anything you think would be useful.
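As a sketch of the automated “load what you know” step, assume a hypothetical source table Source_DGA with one row per oil sample; every name here is illustrative, standing in for whatever your source system provides:

```sql
-- Sketch of the automated load step; Source_DGA and its columns are
-- hypothetical, standing in for any source system table.
INSERT INTO ICOD_Main (
    Unit_Id, Normalized_Data_Point,
    Data_Point_Source_System, Data_Point_Source_Table, Data_Point_Source_Point,
    Data_Point_Datetime, Data_Value
)
SELECT
    s.Equipment_Code,                     -- maps to the business key
    'Oil_Dissolved_H2_PPM',               -- end-mapped point name
    'LabSystem', 'Source_DGA', 'H2_PPM',  -- lineage back to the source
    s.Sample_Datetime,
    CAST(s.H2_PPM AS varchar(4000))       -- everything is stored as text
FROM Source_DGA AS s;
```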
When fetching data for analysis, there are a few logical steps. Please note, I’m going to reference a @rundate variable. It allows you to snap back in time, so you can see what the results were back then:
- Fetch snapshot data based on the last data points before your @rundate. This query is a bit tricky, as its exact form depends on the DB you’re using.
- Compute MAX(Data_Point_Datetime) OVER (PARTITION BY Unit_Id, Normalized_Data_Point) AS Last_Normalized_Data_Point_Datetime, put that in a subquery, and filter with a WHERE clause that matches the datetimes, returning the last record per unit and point. This is annoying to set up, but powerful when done (see the snapshot sketch after this list).
- The important thing is to focus on the scope of data you’re hunting for. Take the data from the main ICOD table and store it into a smaller set.
- Pull what data you need for trending (a sketch follows this list).
- These pulls would be filtered to Data_Point_Datetime <= @rundate
- The analysis for these would be done before storing the results in your reporting dataset.
- In Tri-State’s analysis, we do some post-processing to look for specific indicators and use those as health delimiters.
- Pull your reference data.
- Wrap up all the data.
- The output format really doesn’t matter too much; the functionality to get to the end result is the main point.
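Here is a sketch of the snapshot fetch described above, written for SQL Server; the window-function approach carries over to other databases with minor syntax changes, and @rundate’s value is just an example:

```sql
-- Snapshot fetch: the last value of every point, per unit, as of @rundate.
DECLARE @rundate datetime = '2024-01-01';  -- example snap-back date

SELECT Unit_Id, Normalized_Data_Point, Data_Point_Datetime, Data_Value
FROM (
    SELECT
        Unit_Id, Normalized_Data_Point, Data_Point_Datetime, Data_Value,
        MAX(Data_Point_Datetime) OVER (
            PARTITION BY Unit_Id, Normalized_Data_Point
        ) AS Last_Normalized_Data_Point_Datetime
    FROM ICOD_Main
    WHERE Data_Point_Datetime <= @rundate
) AS snap
WHERE Data_Point_Datetime = Last_Normalized_Data_Point_Datetime;
```

Scoping the inner WHERE clause to the units and points you’re hunting for, then storing the result into a smaller set, is what keeps this fast.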
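And a sketch of the trending pull, staging everything up to @rundate for a scoped set of points into a smaller working table; the temp table and point names are hypothetical:

```sql
-- Trending pull: full history up to @rundate for a scoped set of points,
-- staged for post-processing; #Trend_Work and the point names are illustrative.
SELECT Unit_Id, Normalized_Data_Point, Data_Point_Datetime, Data_Value
INTO #Trend_Work
FROM ICOD_Main
WHERE Data_Point_Datetime <= @rundate
  AND Normalized_Data_Point IN ('Oil_Dissolved_H2_PPM', 'Oil_Dissolved_C2H2_PPM');
```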