We've just released a new feature in CaptureKit based on user feedback: Markdown extraction. This update enhances CaptureKit's content extraction capabilities, allowing you to retrieve clean, structured text from web pages.
📝 Markdown Extraction (in /content API)
You can now extract the Markdown representation of web pages using the /content endpoint. This makes it easier to work with the textual content of web pages in a format that's both human-readable and machine-processable.
Example Request
GET https://api.capturekit.dev/content?access_key=YOUR_ACCESS_KEY&url=https://capturekit.dev&include_markdown=true
Example Response
{
  "success": true,
  "data": {
    "metadata": { ... },
    "links": { ... },
    "html": "Hello, world!",
    "markdown": "# Hello, world!"
  }
}
Parameters
url (string, required): URL of the webpage
access_key (string, required): Your API key
include_markdown (boolean, optional): Set to true to include Markdown data (defaults to false)
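To make the parameters above concrete, here is a minimal Python sketch of a client call using only the standard library. The endpoint, the three query parameters, and the success / data.markdown response fields come from the example above; the function names, the timeout, and the choice to return None when success is false are assumptions made for illustration, not part of CaptureKit's documented behavior.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "https://api.capturekit.dev/content"


def build_content_url(access_key: str, url: str, include_markdown: bool = False) -> str:
    """Build the /content request URL, percent-encoding the target URL."""
    params = {"access_key": access_key, "url": url}
    if include_markdown:
        # The flag is sent as the lowercase string "true", as in the example request.
        params["include_markdown"] = "true"
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_markdown(payload: dict) -> Optional[str]:
    """Return data.markdown from a parsed response, or None if it is missing.

    Treating success=false as "no Markdown" is an assumption of this sketch.
    """
    if not payload.get("success"):
        return None
    return (payload.get("data") or {}).get("markdown")


def fetch_markdown(access_key: str, url: str, timeout: float = 30.0) -> Optional[str]:
    """Call the endpoint and return the Markdown for the given page."""
    request_url = build_content_url(access_key, url, include_markdown=True)
    with urlopen(request_url, timeout=timeout) as response:
        return extract_markdown(json.load(response))
```

Calling fetch_markdown("YOUR_ACCESS_KEY", "https://capturekit.dev") would issue the same request as the example, with the target URL percent-encoded so that any query string it carries does not collide with the API's own parameters.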
Why Markdown?
Markdown provides several advantages over raw HTML:
Readability: Markdown is cleaner and easier to read than HTML
Simplicity: It removes unnecessary styling and formatting
Portability: Easy to use in various applications and platforms
Text Processing: Ideal for content analysis, summarization, and AI processing
Use Cases
Content Management: Import web content directly into your CMS
AI Processing: Feed web content to LLMs and other AI systems in a clean format
Documentation: Extract documentation from websites for offline use
Knowledge Bases: Build internal knowledge repositories from web content
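For the AI processing and knowledge base use cases above, a common first step is to split the extracted Markdown into sections so each one can be indexed or sent to a model separately. The sketch below is one simple way to do that, assuming the Markdown uses "#"-style headings; split_by_heading is a hypothetical helper written for this post, not something CaptureKit provides.

```python
import re
from typing import List, Tuple

# Matches "#"-style headings from level 1 to 6, e.g. "## Usage".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
FENCE = "`" * 3  # marker that opens and closes a fenced code block


def split_by_heading(markdown: str) -> List[Tuple[str, str]]:
    """Split Markdown into (heading, body) pairs.

    Text before the first heading is returned under an empty heading.
    Lines inside fenced code blocks are never treated as headings.
    """
    sections: List[Tuple[str, str]] = []
    title = ""
    body: List[str] = []
    in_fence = False

    def flush() -> None:
        # Keep a section if it has a heading or any non-blank body text.
        if title or any(line.strip() for line in body):
            sections.append((title, "\n".join(body).strip()))

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            title = match.group(2).strip()
            body = []
        else:
            body.append(line)
    flush()
    return sections
```

Each returned pair can then be stored as its own knowledge base entry or passed to an LLM as a separate chunk, with the heading kept as context.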
Final Notes
This feature was developed in direct response to user feedback. We're committed to building CaptureKit to meet your real-world needs.
Have ideas for more features? Let us know! We're actively developing CaptureKit based on user input.
Thanks for being part of the journey!