# Functions Guide

> Functions enable custom data transformations through Python code that can be reused across multiple validation jobs for consistent data processing.

## Overview

Data transformation functions preprocess and modify column data before validation or comparison. Reusing the same function across jobs keeps data handling consistent and standardized.

{% hint style="success" %}
Perfect for:

* 🧹 Data cleaning and normalization
* 🔄 Custom transformation logic
* 📐 Format standardization
* 🛡️ Data masking and anonymization
* ✅ Validation rule enforcement
  {% endhint %}

***

## Key Benefits

| Benefit             | Impact                                     |
| ------------------- | ------------------------------------------ |
| **Reusability**     | Create once, use across unlimited jobs     |
| **Consistency**     | Standardized transformations everywhere    |
| **Maintainability** | Update function in one place               |
| **Performance**     | Optimized, compiled transformations        |
| **Flexibility**     | Custom Python logic for any transformation |

***

## Function Types

{% tabs %}
{% tab title="Column Transformations" %}
**Transform individual column values**

```python
def transform_email(value):
    """Normalize email addresses"""
    return value.lower().strip()
```

Use for:

* Text normalization
* Date/time formatting
* Value calculations
* Type conversions
  {% endtab %}

{% tab title="Data Cleaning" %}
**Clean and validate data**

```python
def clean_phone_number(value):
    """Extract digits only from phone"""
    return ''.join(c for c in value if c.isdigit())
```

Use for:

* Removing invalid characters
* Null/empty value handling
* Outlier detection
* Data quality checks
  {% endtab %}

{% tab title="Aggregations" %}
**Combine multiple values**

```python
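def calculate_age(birth_date):
    """Calculate age in whole years from a date or datetime birth date"""
    from datetime import date
    today = date.today()
    # Subtract one year if this year's birthday has not happened yet
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)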
```

Use for:

* Multi-field calculations
* Complex derivations
* Statistical operations
* Date/time computations
  {% endtab %}

{% tab title="Security & Privacy" %}
**Mask and anonymize sensitive data**

```python
def mask_credit_card(cc_number):
    """Mask credit card for security"""
    return f"****-****-****-{cc_number[-4:]}"
```

Use for:

* PII masking
* Data anonymization
* Sensitive field protection
* Compliance requirements
  {% endtab %}
  {% endtabs %}

***

## Quick Navigation

* [**Creating Functions**](/data-testing/functions-and-transformations/index/creating-functions.md) – Build and deploy new transformation functions
* [**Using Functions**](/data-testing/functions-and-transformations/index/using-functions.md) – Apply functions to your validation jobs

***

## Function Development Workflow

```mermaid
graph TD
    A["📝 Write Function Code"] --> B["⚙️ Test Locally"]
    B --> C["✅ Create in Platform"]
    C --> D["🧪 Test on Sample Data"]
    D --> E["📌 Save Function"]
    E --> F["🔄 Reuse in Jobs"]
    style A fill:#e1f5ff
    style C fill:#f3e5f5
    style E fill:#c8e6c9
    style F fill:#fff9c4
```

***

## Common Use Cases

### Email Normalization

```python
def normalize_email(email):
    """Convert emails to lowercase, trim whitespace"""
    return email.lower().strip()
```

**Applied to:** Customer email validation and comparison

***

### Date Standardization

```python
def standardize_date(date_str):
    """Convert MM/DD/YYYY date strings to ISO 8601 (YYYY-MM-DD)"""
    from datetime import datetime
    dt = datetime.strptime(date_str, "%m/%d/%Y")
    return dt.strftime("%Y-%m-%d")
```

**Applied to:** Cross-source date field alignment

***

### Phone Number Normalization

```python
def format_phone(phone):
    """Extract digits and format"""
    digits = ''.join(c for c in phone if c.isdigit())
    if len(digits) == 10:
        return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
    return digits
```

**Applied to:** Contact information validation

***

### Currency Conversion

```python
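def usd_to_cents(amount):
    """Convert a USD amount to whole cents"""
    from decimal import Decimal, ROUND_HALF_UP
    # Decimal avoids float errors: int(19.99 * 100) would give 1998
    cents = Decimal(str(amount)) * 100
    return int(cents.to_integral_value(rounding=ROUND_HALF_UP))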
```

**Applied to:** Financial data comparison

***

## Function Best Practices

{% hint style="info" %}
**Writing Effective Functions:**

1. ✅ Keep each function focused on a single transformation
2. ✅ Handle null/None values gracefully
3. ✅ Add input validation for type safety
4. ✅ Use descriptive names and docstrings
5. ✅ Test with edge cases and sample data
6. ✅ Document expected input/output formats
7. ✅ Optimize for performance with large datasets
   {% endhint %}
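
Several of these practices can be combined in one small function. The sketch below is illustrative only: `normalize_country_code` is a hypothetical example, not a platform built-in, and returning `None` for unusable input is one possible convention rather than a platform requirement.

```python
def normalize_country_code(value):
    """Return a two-letter upper-case country code, or None if unusable.

    Expected input: a string such as " us " or "Us".
    Expected output: "US", or None for null/invalid input.
    """
    # Practice 2: pass null values through instead of raising
    if value is None:
        return None
    # Practice 3: validate the input type before calling string methods
    if not isinstance(value, str):
        return None
    code = value.strip().upper()
    # Reject anything that is not exactly two ASCII letters
    if len(code) != 2 or not code.isalpha() or not code.isascii():
        return None
    return code
```

Keeping the null check, the type check, and the format check as separate early returns makes each edge case easy to test on its own.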

***

## Function Constraints & Limits

| Constraint         | Limit       | Details                   |
| ------------------ | ----------- | ------------------------- |
| **Function Size**  | 64 KB       | Maximum code length       |
| **Execution Time** | 5 seconds   | Per value timeout         |
| **Memory**         | 512 MB      | Process memory limit      |
| **External Calls** | Restricted  | Network calls not allowed |
| **File Access**    | Not allowed | No filesystem access      |
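
Some of these limits can be approximated locally before creating the function in the platform. The following is a rough sketch under two assumptions: that the 64 KB limit counts UTF-8 bytes of source code, and that local timing is only a loose proxy for the platform's per-value timeout. The helper names are hypothetical.

```python
import time

MAX_SOURCE_BYTES = 64 * 1024   # assumption: the 64 KB limit counts UTF-8 bytes
MAX_SECONDS_PER_VALUE = 5.0    # per-value timeout from the constraints table

def source_within_limit(source_code):
    """Return True if the function source fits within the assumed size limit."""
    return len(source_code.encode("utf-8")) <= MAX_SOURCE_BYTES

def slowest_call_seconds(func, sample_values):
    """Run func on each sample value and return the slowest single call time."""
    slowest = 0.0
    for value in sample_values:
        start = time.perf_counter()
        func(value)
        slowest = max(slowest, time.perf_counter() - start)
    return slowest
```

A function whose slowest local call is anywhere near `MAX_SECONDS_PER_VALUE` is worth optimizing before it is saved, since the platform's hardware and load may differ from a local machine.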

***

## Debugging Functions

### Test Your Function Locally

```python
# Test locally before deployment
def clean_name(name):
    """Remove extra whitespace"""
    return ' '.join(name.split())

# Test cases
test_cases = [
    "John  Doe",
    "  Jane  Smith  ",
    "Bob"
]

for test in test_cases:
    print(f"'{test}' -> '{clean_name(test)}'")
```
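
Printed output has to be checked by eye. As a possible alternative, the sketch below pairs each input with its expected output and uses `assert`, so a regression fails loudly. The extra edge cases (empty string, tabs and newlines) are illustrative additions.

```python
def clean_name(name):
    """Remove extra whitespace (same function as in the example above)"""
    return ' '.join(name.split())

# Each pair is (input, expected output)
cases = [
    ("John  Doe", "John Doe"),
    ("  Jane  Smith  ", "Jane Smith"),
    ("Bob", "Bob"),
    ("", ""),                      # empty input stays empty
    ("\tAnn\nLee ", "Ann Lee"),    # tabs and newlines count as whitespace
]

for raw, expected in cases:
    actual = clean_name(raw)
    assert actual == expected, f"{raw!r}: expected {expected!r}, got {actual!r}"
```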

### View Function Logs

When a function is applied in a job, check the job's execution logs to debug transformation issues.

### Handle Errors Gracefully

```python
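def transform_logic(value):
    """Placeholder: replace with your actual transformation"""
    return value.strip().upper()

def safe_transform(value):
    """Handle None and invalid values"""
    if value is None or value == "":
        return None
    try:
        return transform_logic(value)
    except Exception:
        # Fall back to the original value if the transformation fails
        return value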
```

***

## Performance Tips

| Optimization               | Benefit                          |
| -------------------------- | -------------------------------- |
| **Avoid nested loops**     | Avoids O(n²) running time        |
| **Use built-in functions** | Faster than custom loops         |
| **Cache lookups**          | Avoid repeated calculations      |
| **Batch processing**       | Process multiple values together |
| **Pre-compile regex**      | Speed up pattern matching        |
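
Two of these tips, pre-compiling regular expressions and caching lookups, can be sketched with the standard library alone. The names `NON_DIGITS`, `digits_only`, `STATE_NAMES`, and `state_name` are hypothetical and exist only for illustration. Whether module-level state persists between values depends on how the platform runs functions, which this sketch assumes rather than confirms.

```python
import re
from functools import lru_cache

# Compiled once when the module loads instead of on every call
NON_DIGITS = re.compile(r"\D")

def digits_only(value):
    """Strip every non-digit character using the pre-compiled pattern."""
    return NON_DIGITS.sub("", value)

# Hypothetical lookup table used only for illustration
STATE_NAMES = {"CA": "California", "NY": "New York"}

@lru_cache(maxsize=1024)
def state_name(code):
    """Cache results so repeated inputs skip normalization and lookup."""
    return STATE_NAMES.get(code.strip().upper())
```

Caching pays off mainly when a column contains many repeated values and the work per value is more expensive than a dictionary lookup.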

***

## Related Documentation

* [Creating Functions Guide](/data-testing/functions-and-transformations/index/creating-functions.md)
* [Using Functions Guide](/data-testing/functions-and-transformations/index/using-functions.md)
* [Demo: Create Functions](/data-testing/demo-walkthroughs/index/create-functions.md)
* [Demo: Use Functions](/data-testing/functions-and-transformations/index/using-functions.md)
* [Create Compare Job](/data-testing/jobs-and-workflows/index/compare-job.md)

***

## FAQ

**Q: Can I use external libraries?** A: Modules from the Python standard library are supported. Third-party packages must be pre-approved.

**Q: How many functions can I create?** A: Unlimited functions. You can organize them by purpose or data domain.

**Q: Can I edit a function after deploying?** A: Yes, you can edit functions anytime. Changes apply to new job runs.

**Q: What if my transformation fails?** A: Implement error handling to return default values or null.


