Sunday, February 23, 2014

Alrighty... So it's been a bit since I last posted...

Alrighty...

So it's been a bit since I last posted on here and wow, we have covered quite a bit of material in the last few weeks.

It doesn't look like much when you list them out:
  • Advanced Star Schema Design
  • Data Quality Analysis
  • Dashboard Design and Analysis
  • Web Metrics
  • Google Analytics


But... these can get very detailed. In fact, you could probably spend a career just focusing on any one of these.

Advanced Star Schema Design
So, now that we have submitted the homework and there has been a little time for the material to sink in, the design process is making more sense. Like I said before, transitioning mentally from trying to make everything 3NF and even 4NF to "just clump it all together" takes a lot of effort. Also, truly understanding what you are trying to model and get results for was a bit difficult for me when doing the homework. I think I was over-analyzing the process and just making it too difficult when a simple approach would have sufficed. Going through the homework was a great learning experience on “simplification.” Fortunately, I work in an environment where there is quite a bit of unprocessed data, and when I can finally take a break from school I'll look into how I would design the schemas for ingesting it into a data mart/warehouse. Too bad I can only get to it on special networks.

Data Quality Analysis
That star design stuff leads right into DQA. The premise is pretty much a no-brainer: garbage in = garbage out (the actual process, on the other hand, is not nearly as quick). The designs for collapsing all the operational data tables into a "small" set of fact and dimension tables may be great, but if the data in those tables is not consistent, the reports are never going to produce valid results and will always be questionable. So before pushing the data, it should be analyzed for inconsistencies (data profiling) and cleaned up. This could be a very time-consuming process if it had to be done manually, even for relatively simple data sets. Fortunately, there are programs/tools that are designed specifically for this task. That's a good thing, especially when one considers that there are organizations with millions if not billions of pieces of data, and that the profiling and cleaning process may need to be done numerous times before the resulting set is deemed of high enough quality for ingestion.
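To make the profiling idea a little more concrete for myself, here is a minimal Python sketch. It is not one of the dedicated tools mentioned above, and the sample rows, column names and the `profile` helper are all made up for illustration. It counts blanks, distinct values (with and without regard to case), non-numeric entries in a column that should be numeric, and exact duplicate rows.

```python
from collections import Counter

# Hypothetical sample rows; the columns and values are invented for illustration.
rows = [
    {"customer_id": "1001", "state": "CO", "order_total": "25.00"},
    {"customer_id": "1002", "state": "co", "order_total": "40.50"},
    {"customer_id": "1003", "state": "", "order_total": "abc"},
    {"customer_id": "1001", "state": "CO", "order_total": "25.00"},
]


def is_number(value):
    """Return True if the string can be parsed as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def profile(rows, numeric_columns=()):
    """Build a simple per-column profile plus a count of duplicate rows."""
    report = {}
    columns = list(rows[0].keys()) if rows else []
    for col in columns:
        values = [row.get(col, "") for row in rows]
        non_blank = [v for v in values if v.strip() != ""]
        stats = {
            "blank": len(values) - len(non_blank),
            "distinct": len(set(non_blank)),
            # A lower count here than in "distinct" hints at inconsistent casing.
            "distinct_ignoring_case": len({v.upper() for v in non_blank}),
        }
        if col in numeric_columns:
            stats["non_numeric"] = sum(1 for v in non_blank if not is_number(v))
        report[col] = stats

    # Rows that repeat an earlier row exactly (same value in every column).
    row_counts = Counter(tuple(sorted(row.items())) for row in rows)
    duplicate_rows = sum(count - 1 for count in row_counts.values())
    return report, duplicate_rows


report, duplicate_rows = profile(rows, numeric_columns=("order_total",))
for column, stats in report.items():
    print(column, stats)
print("duplicate rows:", duplicate_rows)
```

On this tiny sample the profile flags the blank state, the "CO"/"co" casing mismatch, the non-numeric order total and the repeated row, which is exactly the kind of thing that would need cleaning before a load.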

Then, once the data is all tidied up and the anomalies handled, it can be loaded into a data mart/warehouse. Cool! (yes, geeky)

Dashboard Design and Analysis
Now that all that operational data has been collapsed, cleaned up and loaded, it’s time to do something with it. What? This class is “Business Intelligence” so it makes sense that we would go over how to extract useful (intelligent) information from all that data and provide it to some business-type folks who can use the results for making decisions. This is where dashboards (and analysis of the information they display) come into play.

The theory is, a dashboard should provide a quick-glance summary of some data set (facts and dimensions) and provide meaning to the business. It should be simple and not require any cross-referencing or lookups to understand. It's not much different from the dashboard in a car: speedometer, gas gauge, odometer, maybe engine temperature and battery charge. Granted, a business dashboard would have a little more, like graphs and summaries, but this is the general idea.

The key is figuring out exactly what data to pull for display. This is where a dashboard designer must understand not only the dashboard design tools and underlying data, but also the target audience. In many cases, the same sets of data may need to be presented from different perspectives to accommodate the audience's focus. Network engineers may want to know who is utilizing the most bandwidth and which sites are being accessed most frequently, while the finance department may only want to see the costs per user or department for the leased line the internet connection is coming through. And somewhere in the middle may be a manager who wants to see a combination to determine who is costing the company money compared to their productivity.
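As a rough illustration of "same data, different audience," here is a small Python sketch. Everything in it is an assumption on my part: the usage records, the `MONTHLY_LINE_COST` figure, and especially the idea of splitting the line cost in proportion to usage, which is just one possible allocation rule.

```python
from collections import defaultdict

# Hypothetical usage records; none of these figures come from a real system.
usage = [
    {"user": "alice", "department": "Engineering", "megabytes": 600},
    {"user": "bob", "department": "Engineering", "megabytes": 200},
    {"user": "carol", "department": "Finance", "megabytes": 200},
]
MONTHLY_LINE_COST = 1000.0  # assumed monthly cost of the leased line


def bandwidth_by_user(usage):
    """Network engineering view: users ranked by bandwidth consumed."""
    totals = defaultdict(int)
    for record in usage:
        totals[record["user"]] += record["megabytes"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def cost_by_department(usage, line_cost):
    """Finance view: line cost split in proportion to each department's usage."""
    totals = defaultdict(int)
    for record in usage:
        totals[record["department"]] += record["megabytes"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {dept: 0.0 for dept in totals}
    return {dept: line_cost * mb / grand_total for dept, mb in totals.items()}


print(bandwidth_by_user(usage))
print(cost_by_department(usage, MONTHLY_LINE_COST))
```

Both functions read the same records; only the grouping and the measure change, which is really the whole point of tailoring a dashboard to its audience.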

The visual piece of the dashboard that I just can’t get into is scorecards. Got it, understand their use. I like to see them and use them. But designing them? I’m not the guy. It’s not that I can’t be creative or “visual” (heck, I’ve been doing photography as a hobby since I was a kid), but building pretty buttons and graphs for someone else isn’t my thing. Maybe it’s lack of exposure, experience or need. Who knows, maybe I’ll change my mind. I did with respect to MS SQL – swore up and down for years that I would never do databases. Now… I work with MS SQL, PostgreSQL and Oracle, so much so that I have been put on projects just because of my SQL experience. So there is hope :)

[Hmmm. Re-read what I’ve written so far. If I didn’t know any better I’d say I knew what the heck I was writing about. It’s definitely a good thing that this doesn’t have to be an overly technical post. I do enough technical writing for work producing test and deployment plans and supporting documentation and that tends to be a bit dry. Nice to be able to write free-form for a little bit.

Now, time for a change of technical pace and a discussion of the latest topics we’ve covered.]

Web Metrics
In going through the reading material for this portion of the latest module, I’ve (re)learned that even within the IT community each segment has its own language. Exit and bounce rates, conversions, visitor metrics, demographics, order values and campaigns: this is obviously where business and sales have influenced IT. Brand new perspective for me. I’ve provided technical expertise in pre-sales, writing proposals and statements of work and executing contracts, but never had to work directly in this area. Fortunately, the concepts are easy to comprehend. There are endless examples on the web to look at in this context: virtually any company trying to get visitors to buy goods, download content or fill out information. Just like brick and mortar companies, it’s all about numbers and analysis: who is visiting, why are they visiting, what are they looking for, when are they visiting and where did they come from? Once there is an understanding of the answers, actions can be taken in terms of marketing: who to target, how to get their attention (and business) and when is the best time to do so.
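Since bounce and exit rates were new vocabulary for me, here is a minimal Python sketch of how I understand them. The session data is hypothetical, and I'm using the common definitions: bounce rate is the share of sessions that viewed only one page, and a page's exit rate is the share of its pageviews that were the last page of a session.

```python
from collections import Counter

# Hypothetical sessions: each is the ordered list of pages one visitor viewed.
sessions = [
    ["/home"],
    ["/home", "/products", "/checkout"],
    ["/products"],
    ["/home", "/products"],
]


def bounce_rate(sessions):
    """Fraction of sessions in which the visitor viewed exactly one page."""
    if not sessions:
        return 0.0
    bounces = sum(1 for pages in sessions if len(pages) == 1)
    return bounces / len(sessions)


def exit_rates(sessions):
    """For each page, exits from that page divided by its total pageviews."""
    views = Counter(page for pages in sessions for page in pages)
    exits = Counter(pages[-1] for pages in sessions if pages)
    return {page: exits[page] / views[page] for page in views}


print("bounce rate:", bounce_rate(sessions))
for page, rate in exit_rates(sessions).items():
    print(page, round(rate, 2))
```

In this made-up sample, two of the four sessions are single-page visits, so the bounce rate is 0.5, and "/checkout" has an exit rate of 1.0 because its only pageview was also the end of that session.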

Google Analytics
This portion of this latest module has been one of the more enlightening, dare I say fun, topics so far. Not completely sure why. Maybe it was because I was able to see all the previously taught material put together in a usable, coherent package. Maybe it’s that I've been allowed to view real data from a live system and slice, dice and drill down to my own interactions (I’m pretty confident that the site visit in January from Afghanistan was me and I actually captured my real-time activity, see image below). I was unable to convince any of my local businesses at home to let me access their Google Analytics data for my homework, but I (along with other students from the class) was granted permission to look at the MISonline data. I was initially unhappy with this as I wanted to be able to provide some sort of benefit to a company that I do business with, but in retrospect, looking at MISonline has been more beneficial to me. Being able to view how other people have interacted with the site and how my own interactions affect the cumulative data is much more educational.


Anyway, I think I have rambled enough about school. Although, I will point out that the next module covering social network analysis could be very interesting. I am presuming that mathematical modeling is somehow involved. We’ll see…
