Slithery sync ups, Airbyte sinks teeth into open source Python library
Open source ‘data movement’ platform company Airbyte has noted that its PyAirbyte open source Python library (introduced in late February) has helped more than 10,000 AI and data engineers to sync over 6 billion records of data.
Users have completed more than 221,000 PyAirbyte sync jobs, or over 10,000 syncs per week.
PyAirbyte now boasts an average of 25,000 monthly downloads, according to metrics from the PyPI Python package repository.
Python is a popular programming language for data engineers because of its simplicity, versatility and scalability. It also offers a wide range of libraries and frameworks for data manipulation, exploration and visualisation. Python easily integrates with tools commonly used in data analysis and data science, such as SQL databases, Hadoop and Spark.
Apache Arrow
Significant community contributions have enhanced PyAirbyte’s capabilities, including the following features: Docker executor and networking support; integration with Apache Arrow; support for FastAPI and other web frameworks; an expanded set of BigQuery authentication options; and numerous other bug fixes and improvements.
Users extract data to populate AI frameworks such as LangChain and LlamaIndex, which facilitates building LLM-powered applications.
Also popular among users is moving data from Amazon, Facebook, Google, HubSpot and Salesforce to make data-driven marketing decisions. The next most popular use case is help desk and IT work, drawing on data from Jira, GitHub and Zendesk. PyAirbyte consolidates data from various sources for analysis to improve decision making.
PyAirbyte delights
PyAirbyte makes it easy to move data across API sources and destinations by enabling Airbyte resources to be created and managed using code, rather than the user interface (UI), providing a natural fit into the coding workflow. Python users have access to Airbyte’s more than 250 data connectors, rather than having to build and maintain those themselves. Airbyte is the first to provide Python users with this capability.
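To give a flavour of what managing a sync in code rather than in the UI can look like, here is a minimal sketch. It assumes the airbyte package is installed from PyPI and uses the sample-data connector "source-faker" with its "users" stream; the helper names build_faker_config and sync_users are hypothetical and chosen only for illustration, and the calls follow PyAirbyte's published quickstart as we understand it, so treat this as an illustration rather than a definitive recipe.

```python
from typing import Any, Dict


def build_faker_config(count: int = 1000) -> Dict[str, Any]:
    """Build the connector config (hypothetical helper).

    Assumption: the sample-data connector accepts a `count` option
    giving the number of fake records to generate.
    """
    if count <= 0:
        raise ValueError("count must be a positive integer")
    return {"count": count}


def sync_users(count: int = 1000):
    """Read the `users` stream into a pandas DataFrame (hypothetical helper)."""
    # Imported lazily so this module loads even where PyAirbyte is absent.
    import airbyte as ab  # assumption: installed via `pip install airbyte`

    # Configure the connector in code instead of in the Airbyte UI.
    source = ab.get_source(
        "source-faker",
        config=build_faker_config(count),
        install_if_missing=True,
    )
    source.check()                    # verify the config and connection
    source.select_streams(["users"])  # sync only the stream we need
    result = source.read()            # records land in a local cache
    return result["users"].to_pandas()
```

Calling sync_users(100) would be expected to return a DataFrame of generated user records. In principle, swapping the connector name and config for a real source is the only change needed to point the same few lines at production data.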
“With a majority of existing data pipelines written in Python today, we expected PyAirbyte to be popular but it has exceeded our expectations,” said Michel Tricot, co-founder and CEO, Airbyte. “We’ve seen more than 150 unique data sources used to move data to destinations, like AI frameworks and data warehouses.”
PyAirbyte is an addition to the Airbyte API and Terraform Provider, which enable programmatic management of Airbyte resources, streamlining workflows and integrating Airbyte configurations with existing data infrastructure. Other deployment models include Airbyte Open Source, Airbyte Self-Managed Enterprise, Airbyte Cloud and Powered by Airbyte.