How To: upsert_rows() in DataversePython

How to Use the `upsert_rows()` Method in DataversePython

The upsert_rows() method in the DataverseClient class is designed for advanced data migration and update scenarios in Microsoft Dataverse. Unlike insert_rows(), which lets Dataverse generate GUIDs for new records, upsert_rows() gives you full control over record GUIDs—essential for migrating existing databases and maintaining relationships. It also supports a true update mode, allowing you to update only existing records, which is perfect for enriching data after an initial migration.

What is `upsert_rows()`?

upsert_rows() lets you insert or update records in Dataverse entities using a pandas DataFrame. You specify the primary key column, so you can preserve relationships and GUIDs from your source system. The behavior of the method is controlled by the only_update_if_exists parameter:

Upsert: If only_update_if_exists=False (default), the method will insert new records with your specified GUIDs if they do not exist, or update them if they already exist. This is ideal for migrations where you want to preserve existing relationships and record GUIDs.
Update: If only_update_if_exists=True, the method will only update records that already exist in Dataverse, skipping any that do not. This is useful for incremental updates, such as adding lookup fields or updating option sets after an initial migration.

Method Signature

upsert_rows(
    entity: str,
    df: pd.DataFrame,
    primary_key_col: str,
    only_update_if_exists: bool = False
) -> None

Parameters Explained

entity (str, required):
The plural name of the Dataverse entity (e.g., 'accounts', 'contacts').
df (pd.DataFrame, required):
DataFrame containing the data to upsert. Must include a column with the primary key GUIDs.
primary_key_col (str, required):
The name of the column in your DataFrame that contains the GUIDs for each record.
only_update_if_exists (bool, optional):
If True, only updates existing records. If False, inserts new records or updates existing ones.

Basic Example: Upsert Accounts

from DataversePython.DataverseClient import DataverseClient
import pandas as pd

client = DataverseClient('sample_config.json')

# DataFrame with GUIDs from legacy system
accounts_df = pd.DataFrame({
    'accountid': ['guid1', 'guid2'],
    'name': ['Contoso', 'Fabrikam'],
    'revenue': [100000, 250000]
})

# Upsert accounts, preserving GUIDs
client.upsert_rows(
    entity='accounts',
    df=accounts_df,
    primary_key_col='accountid',
    only_update_if_exists=False
)

Example: Update Only Existing Records

# Update only existing accounts with new lookup fields
update_df = pd.DataFrame({
    'accountid': ['guid1', 'guid2'],
    'primarycontactid@odata.bind': ['/contacts(contact-guid1)', '/contacts(contact-guid2)'],  # Lookup field
    'industrycode': ['1', '2']  # Optionset field
})

client.upsert_rows(
    entity='accounts',
    df=update_df,
    primary_key_col='accountid',
    only_update_if_exists=True
)

Advanced Example: Upsert Different Column Types

Optionset: Pass the integer value for the optionset column (e.g., '1') as a string.
Multioptionset: Pass a integer values as string seperated by comma (e.g., '1, 2') for multioptionset columns.
Lookup: Pass '/' followed by the related entitys plural name with the GUID in parentheses (e.g., /contacts(contact-guid1)).

# Upsert with optionset, multioptionset, and lookup fields
accounts_df = pd.DataFrame({
    'accountid': ['guid1'],
    'name': ['Contoso'],
    'industrycode': ['1'],  # Optionset
    'custom_multioptionset': ['1, 2'],  # Multioptionset
    'primarycontactid@odata.bind': ['/contacts(contact-guid1)']  # Lookup
})

client.upsert_rows(
    entity='accounts',
    df=accounts_df,
    primary_key_col='accountid',
    only_update_if_exists=False
)

Tips

For lookups, the column name needs '@odata.bind' appended. For multi entity lookups, like parentcustomerid, you need to append the specific entity to the column name. e.g. parentcustomerid_contacts@odata.bind or parentcustomerid_accounts@odata.bind
Use only_update_if_exists=True for incremental updates after initial migration.
Check DataverseClient.log for detailed error messages and troubleshooting.
Refer to the Microsoft Dataverse Web API documentation for advanced scenarios.

Conclusion

The upsert_rows() method is essential for data migration and advanced update scenarios in Dataverse. It gives you full control over record IDs, supports all major column types, and enables incremental updates—making it a must-have tool for any Dynamics 365 or Power Platform specialist working with Python.

Happy coding and successful migrations!

How To insert_rows() in DataversePython

How to Use the `insert_rows()` Method in DataversePython

The insert_rows() method in the DataverseClient class allows you to insert data from a pandas DataFrame into a Microsoft Dataverse entity. This article explains its usage, parameters, and provides practical examples to help you get started.

What is `insert_rows()`?

The insert_rows() method lets you add multiple records to any Dataverse entity (such as accounts, contacts, or custom tables) by passing a pandas DataFrame. Each row in the DataFrame is sent as a new record to Dataverse.

Method Signature

insert_rows(
    entity: str,
    df: pd.DataFrame
) -> None

Parameters Explained

entity (str, required):
The plural name of the Dataverse entity you want to insert into (e.g., 'accounts', 'contacts').
df (pd.DataFrame, required):
The DataFrame containing the rows to insert. Each column should match a field in the entity schema.

Basic Example: Insert New Accounts

from DataversePython.DataverseClient import DataverseClient
import pandas as pd

# Initialize the client with your configuration file
prod = DataverseClient('sample_config.json')

# Create a DataFrame with new account data
new_accounts = pd.DataFrame([
    {"name": "Contoso Ltd", "emailaddress1": "info@contoso.com"},
    {"name": "Fabrikam Inc", "emailaddress1": "contact@fabrikam.com"}
])

# Insert the new accounts
prod.insert_rows(entity='accounts', df=new_accounts)

Example: Insert Contacts with Basic Fields

# Create a DataFrame for new contacts
contacts = pd.DataFrame([
    {"firstname": "Alice", "lastname": "Smith", "emailaddress1": "alice.smith@example.com"},
    {"firstname": "Bob", "lastname": "Jones", "emailaddress1": "bob.jones@example.com"}
])

prod.insert_rows(entity='contacts', df=contacts)

Example: Insert Contacts with Optionset Fields

# Example: Insert contacts with an optionset (choice) field
        # Assume 'preferredcontactmethodcode' is an optionset field (e.g., 1=Email, 2=Phone)
        contacts_optionset = pd.DataFrame([
            {"firstname": "Charlie", "preferredcontactmethodcode": 1},
            {"firstname": "Dana", "preferredcontactmethodcode": 2}
        ])

        prod.insert_rows(entity='contacts', df=contacts_optionset)

Example: Insert Contacts with Lookup Field (parentcustomerid)

# Example: Insert contacts with a lookup field (parentcustomerid)
        # The parentcustomerid field can reference an account or contact (customer lookup)
        # Use the format: {'parentcustomerid_{entity}@odata.bind': '/accounts()'}
        contacts_lookup = pd.DataFrame([
            {
                "firstname": "Eve",
                "lastname": "Adams",
                "emailaddress1": "eve.adams@example.com",
                "parentcustomerid_accounts@odata.bind": "/accounts(00000000-0000-0000-0000-000000000001)"
            },
            {
                "firstname": "Frank",
                "lastname": "Miller",
                "emailaddress1": "frank.miller@example.com",
                "parentcustomerid_contacts@odata.bind": "/contacts(00000000-0000-0000-0000-000000000002)"
            }
        ])

        prod.insert_rows(entity='contacts', df=contacts_lookup)

Example: Insert Contacts with Multi-Lookup Field

# Example: Insert contacts with a multi-lookup field
        # Assume 'new_multilookupfield' is a custom multi-lookup field (e.g. 1=training,2=consulting,3=developing)  
        contacts_multilookup = pd.DataFrame([
            {
                "firstname": "Grace",
                "lastname": "Hopper",
                "emailaddress1": "grace.hopper@example.com",
                "new_multilookupfield": "1, 3"
            }
        ])

        prod.insert_rows(entity='contacts', df=contacts_multilookup)

Notes and Best Practices

Each row in the DataFrame is sent as a separate POST request to Dataverse.
If a row fails to insert, the method logs an error (including the error code and response) to DataverseClient.log.
Make sure your DataFrame columns match the entity's schema in Dataverse.
For large inserts (more than 50k rows), consider batching in multiple DataFrames.
A insert_rows_batch() is already on the roadmap 😉

Error Handling

If an insertion fails (e.g., due to validation or permission issues), an error is logged to DataverseClient.log and the method continues with the next row. Always check this log file for troubleshooting.

Tips

Always use the plural form of the entity name (e.g., 'accounts', not 'account').
Refer to the Microsoft Dataverse Web API documentation for field names and data types.
Use the VSCode DataWrangler Extension and Jupyter notebooks for interactive data preparation.

Conclusion

The insert_rows() method makes it easy to add new records to Dataverse entities from pandas DataFrames, enabling seamless integration between Python and Microsoft Dataverse for automation and data migration scenarios.

Happy coding!

How To: get_rows() in DataversePython

How to Use the `get_rows()` Method in DataversePython

The get_rows() method in the DataverseClient class is a powerful tool for retrieving data from Microsoft Dataverse entities directly into a pandas DataFrame. This article will walk you through its usage, parameters, and provide practical examples to help you get started.

What is `get_rows()`?

The get_rows() method allows you to query any Dataverse entity (such as accounts, contacts, or custom tables) and return the results as a pandas DataFrame. This makes it easy to analyze, filter, and manipulate your Dataverse data using Python's data science ecosystem.

Method Signature

get_rows(
    entity: str,
    top: int | None = None,
    columns: list = [],
    filter: str | None = None,
    include_odata_annotations: bool = False
) -> pd.DataFrame

Parameters Explained

entity (str, required):
The plural name of the Dataverse entity you want to query (e.g., 'accounts', 'contacts').
top (int, optional):
The maximum number of rows to retrieve. If not set, all available rows are returned.
columns (list, optional):
A list of column names to include in the result. If empty, all columns are returned.
filter (str, optional):
An OData filter string to limit the results (e.g., 'revenue gt 100000').
include_odata_annotations (bool, optional):
If True, includes OData annotations in the response. Usually not needed for basic data analysis.

Basic Example: Retrieve All Accounts

from DataversePython.DataverseClient import DataverseClient

# Initialize the client with your configuration file
client = DataverseClient('sample_config.json')

# Retrieve all accounts
df = client.get_rows(entity='accounts')

print(df.head())

Example: Select Specific Columns

# Retrieve only the 'name' and 'emailaddress1' columns from accounts
accounts_df = client.get_rows(
    entity='accounts',
    columns=['name', 'emailaddress1']
)

print(accounts_df.head())

Example: Filter Results

# Retrieve accounts with revenue greater than 100,000
filtered_df = client.get_rows(
    entity='accounts',
    filter='revenue gt 100000'
)

print(filtered_df.head())

Example: Limit Number of Rows

# Retrieve only the first 10 contacts
contacts_df = client.get_rows(
    entity='contacts',
    top=10
)

print(contacts_df)

Example: Combine Parameters

# Retrieve the top 5 accounts with a filter and specific columns
custom_df = client.get_rows(
    entity='accounts',
    top=5,
    columns=['name', 'accountnumber'],
    filter="statecode eq 0"  # Only active accounts
)

print(custom_df)

Error Handling

If the request to Dataverse fails (e.g., due to authentication or network issues), an exception is raised and the error is logged to DataverseClient.log. Always check this log file for troubleshooting.

Tips

Always use the plural form of the entity name (e.g., 'accounts', not 'account').
For advanced filtering, refer to the Microsoft Dataverse Web API documentation.
Use the VSCode DataWrangler Extension and Jupyter notebooks for interactive data exploration.

Conclusion

The get_rows() method is a flexible and efficient way to bring Dataverse data into your Python workflow. With support for filtering, column selection, and pagination, it enables powerful data analysis and automation scenarios.

Happy coding!

Supercharge Dataverse with Python and Pandas

Introducing DataversePython: A Powerful Python Client for Microsoft Dataverse

Dataverse is the heart and soul of Microsoft Power Platform and especially Dynamics 365. However, anyone who has worked extensively with Dataverse knows that the out-of-the-box tools—while powerful—can be limiting, especially when it comes to advanced data management, data imports and maintaining your data.

Why a Python-Based DataverseClient?

While Microsoft provides tools like Dataflows or Excel based imports, you'll hit limitations fairly quickly:

Complex Relationships: Managing many-to-many (m:n) relationships between entities isn't supported with the standard tools. You're going to need XRM Toolbox.
Data Operations: Tasks like bulk updating, upserting or inserting require either excel imports or dataflows. Excel imports are pretty finicky and dataflows can't insert lookups or option and multioptionsets at once.
Data Migration: Migrating data from legacy systems or other sources into Dataverse is a common challenge. The process often involves multiple steps and tools, making it error-prone, time-consuming and really complex.

The Solution: Dataverse Web API

To address these challenges, I created DataversePython, a Python package designed to make working with Dataverse easier, faster, and more flexible—especially for developers and IT consultants who are already comfortable with Python and pandas.

Key Features

Data Operations: Easily create, read, update, and upsert records in Dataverse entities. No limitations on any type of lookups, optionsets and multioptionsets. Working with Activities has never been easier!
M:N Relationship Management: Create m:n relationships between entities, something that isn’t possible with standard tools.
Data Cleaning and Maintenance: Leverage the power of pandas to import, transform, and clean your data before pushing it to Dataverse.
Seamless Data Migration: Import data from virtually any source (CSV, Excel, SQL, etc.), transform it as needed, and insert it directly into Dataverse entities.
Logging and Transparency: Built-in logging helps you track operations and troubleshoot issues quickly.

Why Python and pandas?

Python is the language of choice for data professionals, and pandas is the de facto standard for data manipulation.

Typical Use Cases

Data Maintenance: Gain full control over maintenance tasks, such as merging records, cleaning data like unreachable websites.
Data Migration Projects: Import data from legacy systems, transform it, and load it into Dataverse — all within a single Python notebook or script.
Freedom of technologies: Use the power of pandas; your coding skills are your only limitations.

Getting Started

To get started, simply install the package via pip:

pip install DataversePython

Then, connect to your Dataverse environment and start managing your data with the full power of Python and pandas.
View the DataversePython GitHub repository for full documentation and guides on how to get started.

Method Reference & Detailed Instructions

For step-by-step guides and detailed usage instructions for each method in DataversePython, please refer to the following resources:

Conclusion:
DataversePython bridges the gap between the flexibility of Python and the power of Microsoft Dataverse. Whether you’re cleaning data, managing complex relationships, or migrating entire datasets, this package empowers you to do more with less effort. If you’re working with Power Platform or Dynamics 365, give DataversePython a try and supercharge your data workflows!

Recommended

Supercharge Dataverse with Python and Pandas

How To: upsert_rows() in DataversePython

How to Use the upsert_rows() Method in DataversePython

What is upsert_rows()?

Method Signature

Parameters Explained

Basic Example: Upsert Accounts

Example: Update Only Existing Records

Advanced Example: Upsert Different Column Types

Tips

Conclusion

How To insert_rows() in DataversePython

How to Use the insert_rows() Method in DataversePython

What is insert_rows()?

Method Signature

Parameters Explained

Basic Example: Insert New Accounts

Example: Insert Contacts with Basic Fields

Example: Insert Contacts with Optionset Fields

Example: Insert Contacts with Lookup Field (parentcustomerid)

Example: Insert Contacts with Multi-Lookup Field

Notes and Best Practices

Error Handling

Tips

Conclusion

How To: get_rows() in DataversePython

How to Use the get_rows() Method in DataversePython

What is get_rows()?

Method Signature

Parameters Explained

Basic Example: Retrieve All Accounts

Example: Select Specific Columns

Example: Filter Results

Example: Limit Number of Rows

Example: Combine Parameters

Error Handling

Tips

Conclusion

Supercharge Dataverse with Python and Pandas

Introducing DataversePython: A Powerful Python Client for Microsoft Dataverse

Why a Python-Based DataverseClient?

The Solution: Dataverse Web API

Key Features

Why Python and pandas?

Typical Use Cases

Getting Started

Method Reference & Detailed Instructions

How to Use the `upsert_rows()` Method in DataversePython

What is `upsert_rows()`?

How to Use the `insert_rows()` Method in DataversePython

What is `insert_rows()`?

How to Use the `get_rows()` Method in DataversePython

What is `get_rows()`?