About StaticFrame
StaticFrame is an alternative DataFrame library built on an immutable data model. An immutable data model reduces opportunities for error and offers significant performance advantages for some operations, since underlying data can be shared rather than defensively copied. (Read more about the benefits of immutability here.)
StaticFrame is not a drop-in replacement for Pandas. While some conventions and API components are directly borrowed from Pandas, some are completely different, either by necessity (due to the immutable data model) or by choice (offering more uniform, less redundant, and more explicit interfaces). (Read more about differences between Pandas and StaticFrame here.)
An interactive Jupyter notebook, with a guided introduction to using StaticFrame, is available here.
StaticFrame’s API documentation features thousands of code examples, covering nearly every endpoint in the API. For each component, an “overview” and a “detail” section are provided. The overview lists one interface and its signature per line. (See the overview of Series constructors here.) The detail section provides full documentation as well as extensive code examples. (See the detail of Frame methods here.)
Please assist in development by reporting bugs or requesting features. We are a welcoming community and appreciate all feedback! Visit GitHub Issues. To get started contributing to StaticFrame, see Contributing.
The ideas behind StaticFrame developed out of years of work with Pandas and related tabular data structures by the Investment Systems team at Research Affiliates, LLC. In May of 2017 Christopher Ariza proposed the basic model to the Investment Systems team and began implementation. The first public release was in May 2018.
Media
Articles
2023: Type-Hinting DataFrames for Static Analysis and Runtime Validation
2022: The Performance Advantage of No-Copy DataFrame Operations
2022: One Fill Value Is Not Enough: Preserving Columnar Types When Reindexing DataFrames
2022: StaticFrame from the Ground Up: Getting Started with Immutable DataFrames
2022: Using Higher-Order Containers to Efficiently Process 7,163 (or More) DataFrames
Presentations
PyCon US 2023: “Building NumPy Arrays from CSV Files, Faster than Pandas”: https://www.youtube.com/watch?v=ppPXPVV4rDc
PyCon US 2022: “Employing NumPy’s NPY Format for Faster-Than-Parquet DataFrame Serialization”: https://youtu.be/HLH5AwF-jx4
PyData Global 2021: “Why Datetimes Need Units: Avoiding a Y2262 Problem & Harnessing the Power of NumPy’s datetime64”: https://www.youtube.com/watch?v=jdnr7sgxCQI
PyData LA 2019: “The Best Defense is not a Defensive Copy” (lightning talk starting at 18:25): https://youtu.be/_WXMs8o9Gdw?t=1105
PyData LA 2019: “Fitting Many Dimensions into One: The Promise of Hierarchical Indices for Data Beyond Two Dimensions”: https://youtu.be/xX8tXSNDpmE
PyCon US 2019: “A Less Kind, Less Gentle DataFrame” (lightning talk starting at 53:00): https://pyvideo.org/pycon-us-2019/friday-lightning-talksbreak-pycon-2019.html
Talk Python to Me, interview: https://talkpython.fm/episodes/show/204/staticframe-like-pandas-but-safer
PyData LA 2018: “StaticFrame: An Immutable Alternative to Pandas”: https://pyvideo.org/pydata-la-2018/staticframe-an-immutable-alternative-to-pandas.html