As the Enterprise Data and Analytics (EDA) team establishes a data mart to drive predictive modelling and reporting, this role is vital to ensuring trust in the data and insights we share to drive better, faster decision making. We need to establish, grow and automate our data health assurance capabilities, from monitoring to resolution to proactive interventions.
• Create queries to check the success of each ETL job and raise an alert if an issue is found
• Optimize data checks to clearly identify and highlight issues and minimize confusion from alerts
• Proactively monitor for trends that may indicate future issues, for example:
    • The volume of data processed by a specific job is growing over time.
    • A job is consistently taking longer to finish than it did in the past.
    • Tables are growing disproportionately large over time.
• Continually measure frequency of failures per source and product
• Escalate the most frequently failing queries to EDA developers for optimization
• Establish a simple cost measurement based on user consumption for EDA GBQ projects
• Identify high-cost EDA queries and escalate them to EDA analysts and developers as needed for optimization
• Monitor key source systems and indexes to identify any potential problems or latency.
• Communicate all issues that occur to the EDA team
• Determine a broader communications plan with each EDA product owner
• Partner with EDA product owners to implement optimized solutions and checks
• Establish a framework for rolling out the above efficiently to new sources and outputs
• Establish a dashboard for monitoring ETL jobs
• Transition manual data checks to automated checks, including intelligent thresholds for data check monitoring
The primary platform in scope is GBQ, but internal systems and their sources are also in scope. Key tables will be prioritized by EDA leadership.