🔧 Functions Guide

Functions enable custom data transformations through Python code that can be reused across multiple validation jobs for consistent data processing.

Overview

Data transformation functions preprocess and modify column data before validation or comparison. Reusing the same function across jobs keeps data handling consistent and standardized.

Key Benefits

| Benefit | Impact |
| --- | --- |
| Reusability | Create once, use across unlimited jobs |
| Consistency | Standardized transformations everywhere |
| Maintainability | Update function in one place |
| Performance | Optimized, compiled transformations |
| Flexibility | Custom Python logic for any transformation |


Function Types

Transform individual column values

Use for:

  • Text normalization

  • Date/time formatting

  • Value calculations

  • Type conversions


Function Development Workflow


Common Use Cases

Email Normalization

Applied to: Customer email validation and comparison
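
A minimal sketch of what an email normalization function might look like. It assumes a function receives one column value and returns the transformed value; the name `normalize_email` and the choice to return `None` for malformed input are illustrative, not part of the product.

```python
from typing import Optional


def normalize_email(value: Optional[str]) -> Optional[str]:
    """Trim and lowercase an email address; return None for missing or malformed input."""
    if value is None:
        return None
    cleaned = str(value).strip().lower()
    # Require exactly one "@" with a non-empty local part and domain.
    local, sep, domain = cleaned.partition("@")
    if not sep or not local or not domain or "@" in domain:
        return None
    return cleaned
```

With this sketch, `"  John.Doe@Example.COM "` and `"john.doe@example.com"` compare as equal after transformation, while values such as `"not-an-email"` become `None` instead of raising.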


Date Standardization

Applied to: Cross-source date field alignment
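
One possible shape for a date standardization function, sketched under the assumption that each source uses one of a small set of known formats. The format list and the name `standardize_date` are hypothetical; replace the formats with the ones your sources actually emit.

```python
from datetime import datetime
from typing import Optional

# Candidate input formats (illustrative assumption). Order matters for ambiguous
# values: "01/02/2024" is read as month/day/year here.
_INPUT_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d-%b-%Y", "%Y%m%d")


def standardize_date(value: Optional[str]) -> Optional[str]:
    """Return the date as an ISO 8601 string (YYYY-MM-DD), or None if it cannot be parsed."""
    if value is None:
        return None
    text = str(value).strip()
    for fmt in _INPUT_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # Try the next candidate format.
    return None
```

Emitting a single ISO 8601 string from both sides lets the comparison treat `03/15/2024` in one source and `2024-03-15` in another as the same value.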


Phone Number Normalization

Applied to: Contact information validation
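
A hedged sketch of phone number normalization. It assumes 10-digit inputs are national numbers that should receive a default country code (here `"1"`), and it outputs a `+`-prefixed digit string of at most 15 digits. The function name and the `default_country_code` parameter are illustrative.

```python
import re
from typing import Optional

# Compiled once so the pattern is not re-parsed on every call.
_NON_DIGITS = re.compile(r"\D")


def normalize_phone(value: Optional[str], default_country_code: str = "1") -> Optional[str]:
    """Strip formatting characters and return '+<digits>', or None if the length is implausible."""
    if value is None:
        return None
    digits = _NON_DIGITS.sub("", str(value))
    # Assumption: a bare 10-digit number is missing its country code.
    if len(digits) == 10:
        digits = default_country_code + digits
    if not 11 <= len(digits) <= 15:
        return None
    return "+" + digits
```

Under these assumptions, `"(555) 123-4567"` and `"+1 555.123.4567"` both normalize to the same string, so formatting differences no longer cause mismatches.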


Currency Conversion

Applied to: Financial data comparison
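
Because network calls are not allowed inside functions, a currency conversion would need its rates embedded in the code. The sketch below assumes a static rate table and a second `currency` argument; both are hypothetical, and the rates are placeholders rather than real market data.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP
from typing import Optional

# Placeholder rates for illustration only; NOT real exchange rates.
_RATES_TO_USD = {"USD": Decimal("1"), "EUR": Decimal("1.10"), "GBP": Decimal("1.25")}


def to_usd(amount, currency: str = "USD") -> Optional[Decimal]:
    """Convert an amount to USD, rounded to cents; return None for bad input or unknown currency."""
    if amount is None or currency is None:
        return None
    rate = _RATES_TO_USD.get(str(currency).strip().upper())
    if rate is None:
        return None
    try:
        # Remove thousands separators before parsing, e.g. "1,000.50".
        value = Decimal(str(amount).replace(",", "").strip())
        if not value.is_finite():
            return None
        return (value * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    except InvalidOperation:
        return None
```

`Decimal` is used instead of `float` so that rounding to cents is exact and repeatable, which matters when the converted values are compared across sources.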


Function Best Practices

Writing Effective Functions:

  1. ✅ Keep each function focused on a single transformation

  2. ✅ Handle null/None values gracefully

  3. ✅ Add input validation for type safety

  4. ✅ Use descriptive names and docstrings

  5. ✅ Test with edge cases and sample data

  6. ✅ Document expected input/output formats

  7. ✅ Optimize for performance with large datasets


Function Constraints & Limits

| Constraint | Limit | Details |
| --- | --- | --- |
| Function Size | 64 KB | Maximum code length |
| Execution Time | 5 seconds | Per value timeout |
| Memory | 512 MB | Process memory limit |
| External Calls | Restricted | Network calls not allowed |
| File Access | Not allowed | No filesystem access |


Debugging Functions

Test Your Function Locally
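
One way to test locally is to run the function against a table of inputs and expected outputs before deploying it. The harness below is a sketch using plain Python; `trim_and_upper`, `CASES`, and `run_cases` are hypothetical names, not part of the product.

```python
def trim_and_upper(value):
    """Hypothetical transformation under test: trim whitespace and uppercase text."""
    if value is None:
        return None
    return str(value).strip().upper()


# (input, expected) pairs covering typical values and edge cases.
CASES = [
    ("  abc ", "ABC"),
    ("", ""),
    (None, None),
    (42, "42"),
]


def run_cases(func, cases):
    """Return a list of (input, expected, actual) tuples for every failing case."""
    failures = []
    for raw, expected in cases:
        actual = func(raw)
        if actual != expected:
            failures.append((raw, expected, actual))
    return failures


if __name__ == "__main__":
    failures = run_cases(trim_and_upper, CASES)
    for raw, expected, actual in failures:
        print(f"FAIL: {raw!r} -> {actual!r}, expected {expected!r}")
    print("all cases passed" if not failures else f"{len(failures)} case(s) failed")
```

Include empty strings, `None`, and non-string values in the case table, since these are the inputs most likely to behave differently in a job than in a quick manual check.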

View Function Logs

Check execution logs when applied in jobs to debug transformation issues.

Handle Errors Gracefully
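
A sketch of one error-handling pattern: wrap the transformation so that any exception is logged and a default value is returned instead. The decorator `with_default` and the example `parse_amount` are hypothetical, and whether Python's `logging` output appears in the job's function logs is an assumption to verify.

```python
import logging
from functools import wraps

logger = logging.getLogger(__name__)


def with_default(default=None):
    """Decorator: return `default` instead of raising when a transformation fails."""
    def decorator(func):
        @wraps(func)
        def wrapper(value):
            try:
                return func(value)
            except Exception:
                # Record the offending value so the failure can be traced later.
                logger.exception("Transformation %s failed for value %r", func.__name__, value)
                return default
        return wrapper
    return decorator


@with_default(default=None)
def parse_amount(value):
    """Hypothetical transformation: convert text such as '1,234.50' to a float."""
    return float(str(value).replace(",", ""))
```

Returning a default keeps one bad value from failing the whole job, but it can also hide data problems, so logging the original value alongside the fallback is worth keeping.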


Performance Tips

| Optimization | Benefit |
| --- | --- |
| Avoid nested loops | Avoids O(n²) time complexity |
| Use built-in functions | Faster than custom loops |
| Cache lookups | Avoid repeated calculations |
| Batch processing | Process multiple values together |
| Pre-compile regex | Speed up pattern matching |
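
Two of these tips, pre-compiling regular expressions and caching lookups, can be sketched with the standard library alone. The lookup table and the name `country_to_code` are hypothetical; the sketch also assumes input values are hashable, which `functools.lru_cache` requires.

```python
import re
from functools import lru_cache

# Compiled once at import time rather than on every call.
_WHITESPACE = re.compile(r"\s+")

# Hypothetical lookup table; a real one would come from your own reference data.
_COUNTRY_CODES = {"united states": "US", "germany": "DE", "japan": "JP"}


@lru_cache(maxsize=10_000)
def country_to_code(value):
    """Map a country name to a two-letter code, caching results for repeated values."""
    if value is None:
        return None
    # Collapse runs of whitespace, then trim and lowercase before the lookup.
    key = _WHITESPACE.sub(" ", str(value)).strip().lower()
    return _COUNTRY_CODES.get(key)
```

Caching helps most on columns with many repeated values, where the same input is transformed only once; on columns of mostly unique values it adds memory use without much benefit.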



FAQ

Q: Can I use external libraries? A: Standard Python libraries are supported. Third-party packages must be pre-approved.

Q: How many functions can I create? A: Unlimited functions. You can organize them by purpose or data domain.

Q: Can I edit a function after deploying? A: Yes, you can edit functions anytime. Changes apply to new job runs.

Q: What if my transformation fails? A: Implement error handling to return default values or null.
