ARC Data Analytics Handbook

Version 0.0.2

All things data analytics at ARC Resources.

Data Analytics Handbook

This is documentation for the use of our Data Analytics tools and platforms.

Who is this for?

This documentation is intended for anyone who uses our data analytics tools and platforms. This includes Data Users, Citizen Data Scientists, and Data Analytics Developers.

  • Data Users - People at ARC who use data to make decisions. They may use Power BI, Spotfire, Excel, SQL, or other tools to access and analyze data. Documentation for this audience is intended to help them understand how to use these tools and platforms from a novice perspective.
  • Citizen Data Scientists - People at ARC who use data to make decisions and have some technical skills. They may use Python, Databricks, or other tools to access and analyze data. Documentation for this audience is intended to help them understand how to use these tools and platforms from an intermediate perspective.
  • Data Analytics Developers - People whose full-time job is to develop data analytics solutions. This includes our Data Analytics organization. Documentation for this audience is intended to help them understand how to use these tools and platforms from an advanced perspective.

We strive to make our documentation accessible to all of these audiences. If you have any feedback on how we can improve our documentation, please let us know.

Data Analytics Platforms

We use Databricks as our primary data analytics platform. It is a cloud-based platform that allows us to store, process, and analyze data at scale.

Unity Catalog

Unity Catalog is a unified governance solution for all data assets in Databricks. It provides a centralized view of all data assets and allows us to manage access to data at scale.
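Unity Catalog addresses tables through a three-level namespace, catalog.schema.table. As a minimal sketch of that naming convention, the helper below splits and rebuilds a fully qualified name. The `TableName` class, the `parse_table_name` function, and the example name are hypothetical and only for illustration. They are not part of any Databricks API, and the sketch does not handle backtick-quoted identifiers that contain dots.

```python
from typing import NamedTuple


class TableName(NamedTuple):
    """Hypothetical holder for a three-level table reference."""

    catalog: str
    schema: str
    table: str

    def qualified(self) -> str:
        # Join the three levels back into catalog.schema.table form.
        return f"{self.catalog}.{self.schema}.{self.table}"


def parse_table_name(name: str) -> TableName:
    """Split 'catalog.schema.table' into its three parts.

    Simplification: assumes no part contains a dot or is empty.
    """
    parts = name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected catalog.schema.table, got {name!r}")
    return TableName(*parts)


# Example name is made up for illustration.
ref = parse_table_name("analytics.production.well_tests")
print(ref.catalog, ref.schema, ref.table)
```

Keeping the three parts separate makes it easier to see which level (catalog, schema, or table) an access rule or query refers to.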

Code Repositories

We use GitHub as our primary code repository. It allows us to store, version, and collaborate on our code. We use Azure DevOps for task management and release planning.

Data Analytics Tools

We use a variety of tools to access and analyze data. These tools are used by our Data Users, Citizen Data Scientists, and Data Analytics Developers alike.

Visualization Tools

Visualization tools are used to create interactive reports and dashboards. They allow users to visualize data and share insights across the organization.

Power BI

Power BI is a business analytics tool that helps visualize data and share insights across the organization. It allows users to create interactive reports and dashboards.

Spotfire

Spotfire is a data visualization and analytics tool that allows users to create interactive dashboards and reports. It is generally used in our Geoscience and Engineering teams.

Databricks Dashboards

Databricks Dashboards is a feature of Databricks that allows users to create interactive dashboards and reports.

Code Development Tools

Python

Python is a programming language that is widely used for data analysis and machine learning.

SQL

SQL is a query language used to retrieve and manipulate data in relational databases.
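To make this concrete, here is a small sketch of running a SQL aggregation from Python. It uses Python's built-in `sqlite3` module with an in-memory database as a stand-in, not Databricks. The `production` table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3


def average_rate_by_field(rows):
    """Load (field, rate) rows into a temporary table and average by field."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Hypothetical table and column names, for illustration only.
        conn.execute("CREATE TABLE production (field TEXT, rate REAL)")
        conn.executemany("INSERT INTO production VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT field, AVG(rate) "
            "FROM production "
            "GROUP BY field "
            "ORDER BY field"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Made-up sample data.
sample = [("North", 100.0), ("North", 200.0), ("South", 50.0)]
print(average_rate_by_field(sample))
# prints [('North', 150.0), ('South', 50.0)]
```

The same SELECT, GROUP BY, and ORDER BY pattern carries over to other SQL engines, although connection setup and data types differ between platforms.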

Databricks Notebooks

Databricks Notebooks are a feature of Databricks that allows users to create and share documents that contain live code, equations, visualizations, and narrative text.

GitHub Copilot

GitHub Copilot is an AI-powered code completion tool that helps developers write code faster and with fewer errors. It is integrated into Visual Studio Code and other IDEs.

Visual Studio Code

Visual Studio Code is a code editor that is widely used for data analysis and machine learning. It is integrated with GitHub and other tools.

Other Tools

We also use a variety of other tools to access and analyze data.

Data Analytics Processes

We have a variety of processes that we follow to ensure that our data analytics solutions are developed and deployed in a consistent and efficient manner.

How do we develop and deploy data analytics solutions?

The sections below describe the processes we follow to develop and deploy data analytics solutions.

ML Development and Deployment

This section is designed to help data scientists and ML engineers work more effectively throughout the machine learning project lifecycle. It brings together practical guidelines for key areas like coding best practices, model experimentation, registration, and monitoring. It also includes helpful tips on branching strategies, pull requests, and a development checklist to make collaboration smoother and project delivery easier.

Code Best Practices for Data Science/ML Projects

Last updated on 20 Mar 2026
Published on 20 Mar 2026