As a Data Quality Engineer, you will be responsible for building automated, repeatable, efficient processes to perform data quality checks across various data assets. It is your duty to ensure high-quality data is available to internal stakeholders and customers.
To do this, you will design, develop, and document automated (and sometimes manual) test plans and test cases, writing high-quality, well-structured code. The output of these quality assurance checks should be easy to disseminate to the appropriate audiences so that the proper action can be taken to ensure consistent, high-quality data across the organization.
You will be involved with requirements gathering, working closely with other teams such as Data Engineering and effectively communicating or explaining technical concepts to developers, product managers, and business partners. You will provide ongoing support of this data quality framework along with continuous improvement and refinement.
MAJOR JOB RESPONSIBILITIES
Contribute to the design, development, documentation, maintenance, and monitoring of a data quality framework.
Build repeatable, automated, efficient data quality checks.
Continuously validate the data quality across data pipelines and repositories against data from source systems.
Work across teams for requirements gathering.
Document test plans and test cases.
Execute test cases, track bugs, and document and share results.
Troubleshoot, tune performance, and resolve issues where necessary.
Assist with data quality support tickets and inquiries.
Design Data Quality reports and dashboards for various audiences to analyze and communicate the output of the data quality tests.
EDUCATION / QUALIFICATIONS / EXPERIENCE
B.S. in a computer science or information systems field, or 5+ years of related work experience, required.
Strong analytical and critical-thinking skills for solving complex problems.
Strong technical background with a mix of development and automation skills.
Outstanding attention to detail and a record of consistently meeting deadlines.
Exceptional communication and interpersonal skills.
Ability to work within a highly collaborative team, while also being a self-starter able to work independently with little guidance.
Experience in troubleshooting, performance tuning, and optimization.
Proficient in shell scripting, Python, Scala or other programming languages.
Knowledge of Spark/PySpark.
Excellent SQL knowledge, including the ability to read and write SQL queries.
Skilled in Hive (HQL) and HDFS.
Experience working with both unstructured and structured data sets, including flat files, JSON,