Evaluate Job
An Evaluate Job performs comprehensive validations on a single data source to ensure data quality and integrity.
Validations Performed
Row Count Verification – Confirms the dataset contains the expected number of records.
Column Type Verification – Validates that each column's data type matches expectations.
Non-Null Columns Verification – Ensures specified columns contain no null values.
Duplicate Columns Verification – Detects duplicate records based on selected columns.
Regex Verification – Validates column values against regular expression patterns.
User-Function Verification – Applies custom business logic using Python functions (lambda or custom functions).
Note: Functions must return True or False. Only Python and its built-in packages are supported.
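To make the checks above concrete, here is a minimal sketch of what several of them amount to, written in plain Python against a small in-memory list of rows. It is not the product's implementation: the sample rows and every function name below are hypothetical and serve only to illustrate the logic.

```python
from collections import Counter

# Hypothetical sample rows; a real job reads these from the configured data source.
rows = [
    {"id": 1, "email": "a@example.com", "amount": 10.5},
    {"id": 2, "email": "b@example.com", "amount": 3.0},
    {"id": 2, "email": None, "amount": 7.25},
]

def check_row_count(rows, expected):
    """Row count check: True when the number of rows equals the expected count."""
    return len(rows) == expected

def check_column_types(rows, expected_types):
    """Column type check: return the columns holding a non-null value of the wrong type."""
    return sorted({
        col
        for row in rows
        for col, typ in expected_types.items()
        if row[col] is not None and not isinstance(row[col], typ)
    })

def find_null_columns(rows, columns):
    """Non-null check: return the listed columns that contain at least one null."""
    return sorted({col for row in rows for col in columns if row[col] is None})

def find_duplicate_keys(rows, key_columns):
    """Duplicate check: return key tuples that occur in more than one row."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]

def check_user_function(rows, column, func):
    """User-function check: True when func returns True for every value in the column."""
    return all(func(row[column]) for row in rows)
```

With the sample rows, the non-null check reports the email column and the duplicate check reports the repeated id value 2.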
Common Regex Examples
Email validation:
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
US phone number:
^\(\d{3}\) \d{3}-\d{4}$
Credit card number:
^\d{4} \d{4} \d{4} \d{4}$
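Before using these patterns in a job, it can help to try them against sample values. The sketch below does this with Python's standard re module; the PATTERNS dictionary and the matches helper are hypothetical names used only for this illustration, and the product may apply patterns differently.

```python
import re

# The three patterns listed above, keyed by hypothetical names.
PATTERNS = {
    "email": r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$",
    "us_phone": r"^\(\d{3}\) \d{3}-\d{4}$",
    "credit_card": r"^\d{4} \d{4} \d{4} \d{4}$",
}

def matches(name, value):
    """Return True when the whole value matches the named pattern."""
    return re.fullmatch(PATTERNS[name], value) is not None

print(matches("email", "user@example.com"))            # valid address
print(matches("us_phone", "555-123-4567"))             # missing parentheses
print(matches("credit_card", "1234 5678 9012 3456"))   # four space-separated groups
```

Note that the phone and card patterns are strict about layout: a phone number must be written as (555) 123-4567 and a card number as four space-separated groups of four digits.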
Example: Custom Function
def validate_complex_data(data):
    # Check if data is a string
    if not isinstance(data, str):
        return False
    # Check length (assuming hash length of 64 characters)
    if len(data) != 64:
        return False
    import re
    # Validate hexadecimal format
    hex_pattern = re.compile(r'^[0-9a-fA-F]{64}$')
    if not hex_pattern.match(data):
        return False
    return True
Example: Lambda Function
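A lambda suits a check that fits in a single expression. The following is a minimal sketch, assuming the lambda receives one column value at a time, as the custom function above does; the name is_valid_age and the 0 to 120 range are hypothetical.

```python
# Hypothetical lambda: accepts integer values from 0 to 120 and rejects everything else.
# bool is excluded explicitly because Python treats True/False as integers.
is_valid_age = lambda value: (
    isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= 120
)
```

As with any user function, the expression always evaluates to True or False, including for values of an unexpected type.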
How to Create an Evaluate Job
Creating an Evaluate Job involves configuring a single data source and defining the validations to be applied during execution.
Select Evaluate Job as the job type.
Configure the data source details.
Choose the validations to apply (row count, schema, regex, functions, etc.).
(Optional) Attach custom functions for advanced logic.
Review the configuration and create the job.
Run the job and review results in Job History.
For a detailed, step-by-step guide on configuring and creating an Evaluate Job, refer to the documentation below: