Functions Guide
Functions enable custom data transformations through Python code that can be reused across multiple validation jobs for consistent data processing.
Overview
Data transformation functions are powerful tools that preprocess and modify column data before validation or comparison. Reuse functions across jobs to maintain consistency and standardize data handling.
Perfect for:
Data cleaning and normalization
Custom transformation logic
Format standardization
Data masking and anonymization
Validation rule enforcement
Key Benefits
Reusability: Create once, use across unlimited jobs
Consistency: Standardized transformations everywhere
Maintainability: Update function in one place
Performance: Optimized, compiled transformations
Flexibility: Custom Python logic for any transformation
Function Types
Transform individual column values
Use for:
Text normalization
Date/time formatting
Value calculations
Type conversions
Clean and validate data
Use for:
Removing invalid characters
Null/empty value handling
Outlier detection
Data quality checks
Combine multiple values
Use for:
Multi-field calculations
Complex derivations
Statistical operations
Date/time computations
Mask and anonymize sensitive data
Use for:
PII masking
Data anonymization
Sensitive field protection
Compliance requirements
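As an illustration of the masking type, the sketch below hides most of the local part of an email address. The name `mask_email` and the one-value-in, one-value-out signature are assumptions made for this example; the exact interface the platform expects is not specified here.

```python
def mask_email(value):
    """Mask an email's local part, keeping its first character and the domain.

    Hypothetical example: assumes the function receives one column value
    and returns the transformed value.
    """
    if value is None:
        return None
    text = str(value).strip()
    local, sep, domain = text.partition("@")
    if not sep or not local or not domain:
        # Not a recognizable email: mask everything rather than leak it.
        return "*" * len(text)
    return local[0] + "*" * (len(local) - 1) + "@" + domain
```

Values that do not look like an email are fully masked, which errs on the side of protecting the data.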
Quick Navigation
Creating Functions: Build and deploy new transformation functions
Using Functions: Apply functions to your validation jobs
Function Development Workflow
Common Use Cases
Email Normalization
Applied to: Customer email validation and comparison
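A minimal sketch of what an email normalization function could look like, assuming the function takes a single column value and returns the normalized value. The name `normalize_email` is hypothetical.

```python
def normalize_email(value):
    """Trim whitespace and lowercase an email address for comparison.

    Hypothetical example. Lowercasing the whole address is a common
    comparison convention, not a requirement of the email standards.
    """
    if value is None:
        return None
    cleaned = str(value).strip().lower()
    # Treat blank strings the same as missing values.
    return cleaned or None
```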
Date Standardization
Applied to: Cross-source date field alignment
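One possible shape for a date standardization function is sketched below. It tries a short list of input formats and emits ISO 8601 dates. The name `standardize_date` and the list of accepted formats are assumptions; adjust them to the formats your sources actually use.

```python
from datetime import datetime

# Assumed input formats, tried in order; extend to match your sources.
_INPUT_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def standardize_date(value):
    """Return the date as YYYY-MM-DD, or None if no format matches."""
    if value is None:
        return None
    text = str(value).strip()
    for fmt in _INPUT_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # Try the next candidate format.
    return None
```

Ambiguous inputs such as `01/02/2024` are interpreted by whichever format comes first in the list, so the order matters.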
Phone Number Normalization
Applied to: Contact information validation
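The following sketch shows one way to normalize phone numbers to a digits-only form with a leading `+`. The name `normalize_phone`, the `default_country_code` parameter, and the rule that 10-digit numbers get the default country code are all assumptions for illustration.

```python
import re

# Compiled once at import time and reused for every value.
_NON_DIGITS = re.compile(r"\D+")


def normalize_phone(value, default_country_code="1"):
    """Strip formatting from a phone number and prefix a country code.

    Hypothetical example: 10-digit inputs are assumed to be missing
    the country code.
    """
    if value is None:
        return None
    digits = _NON_DIGITS.sub("", str(value))
    if not digits:
        return None
    if len(digits) == 10:
        digits = default_country_code + digits
    return "+" + digits
```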
Currency Conversion
Applied to: Financial data comparison
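Because network calls are not allowed inside functions, a currency conversion function would need its rates embedded in the code. The sketch below assumes a fixed rate table; the name `to_usd`, the two-argument signature, and the rate values are placeholders, not real exchange rates.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP

# Placeholder rates for illustration only; replace with your own values.
_RATES_TO_USD = {
    "USD": Decimal("1"),
    "EUR": Decimal("1.10"),
    "GBP": Decimal("1.25"),
}


def to_usd(amount, currency):
    """Convert an amount to USD using a fixed rate table, rounded to cents."""
    if amount is None or currency is None:
        return None
    rate = _RATES_TO_USD.get(str(currency).strip().upper())
    if rate is None:
        return None  # Unknown currency code.
    try:
        converted = Decimal(str(amount)) * rate
    except InvalidOperation:
        return None  # Amount was not a valid number.
    return converted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

`Decimal` is used instead of `float` so that rounding to cents behaves predictably when values from two sources are compared.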
Function Best Practices
Writing Effective Functions:
Keep each function focused on a single transformation
Handle null/None values gracefully
Add input validation for type safety
Use descriptive names and docstrings
Test with edge cases and sample data
Document expected input/output formats
Optimize for performance with large datasets
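Several of these practices can be seen together in a small sketch. The function below does one thing, passes `None` through, validates its input type, and documents its input and output. The name `collapse_whitespace` is hypothetical, and raising `TypeError` on unexpected types is one design choice among several.

```python
def collapse_whitespace(value):
    """Collapse runs of whitespace to single spaces and trim the ends.

    Input:  str or None
    Output: str or None (None is passed through unchanged)
    Raises: TypeError for any other input type
    """
    if value is None:
        return None
    if not isinstance(value, str):
        raise TypeError(f"expected str or None, got {type(value).__name__}")
    # str.split() with no arguments splits on any whitespace run.
    return " ".join(value.split())
```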
Function Constraints & Limits
Function Size: 64 KB (maximum code length)
Execution Time: 5 seconds (per-value timeout)
Memory: 512 MB (process memory limit)
External Calls: Restricted (network calls not allowed)
File Access: Not allowed (no filesystem access)
Debugging Functions
Test Your Function Locally
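A simple way to test locally is to run the function against a table of sample inputs and expected outputs before deploying it. The sketch below uses a hypothetical `trim_upper` function and a hypothetical `run_samples` helper; neither is part of the platform.

```python
def trim_upper(value):
    """Hypothetical function under test: trim and uppercase a value."""
    if value is None:
        return None
    return str(value).strip().upper()


# (input, expected output) pairs, including edge cases.
SAMPLES = [
    ("  abc ", "ABC"),
    ("", ""),
    (None, None),
    (42, "42"),
]


def run_samples():
    """Return a list of (input, expected, actual) for every mismatch."""
    failures = []
    for given, expected in SAMPLES:
        actual = trim_upper(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures


if __name__ == "__main__":
    print(run_samples() or "all samples passed")
```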
View Function Logs
Check execution logs when applied in jobs to debug transformation issues.
Handle Errors Gracefully
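One common pattern is to catch the specific exceptions a conversion can raise and return a default instead of letting the function fail. The sketch below assumes a hypothetical `safe_to_float` function with a `default` parameter.

```python
def safe_to_float(value, default=None):
    """Convert a value to float, returning `default` if conversion fails."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # TypeError covers None and other unsupported types;
        # ValueError covers strings that are not numbers.
        return default
```

Catching only `TypeError` and `ValueError`, rather than every exception, keeps unrelated bugs visible.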
Performance Tips
Avoid nested loops: prevents O(n²) complexity
Use built-in functions: faster than custom loops
Cache lookups: avoid repeated calculations
Batch processing: process multiple values together
Pre-compile regex: speed up pattern matching
Related Documentation
FAQ
Q: Can I use external libraries? A: Modules from the Python standard library are supported. Third-party packages must be pre-approved.
Q: How many functions can I create? A: Unlimited functions. You can organize them by purpose or data domain.
Q: Can I edit a function after deploying? A: Yes, you can edit functions anytime. Changes apply to new job runs.
Q: What if my transformation fails? A: Implement error handling to return default values or null.