We've just released a new feature in CaptureKit based on user feedback: Markdown extraction. This update enhances CaptureKit's content extraction capabilities, allowing you to retrieve clean, structured text from web pages.

📝 Markdown Extraction (in /content API)

You can now extract the Markdown representation of web pages using the /content endpoint. This makes it easier to work with the textual content of web pages in a format that's both human-readable and machine-processable.

Example Request

GET https://api.capturekit.dev/content?access_key=&url=https://capturekit.dev&include_markdown=true

Example Response

{
  "success": true,
  "data": {
    "metadata": { ... },
    "links": { ... },
    "html": "<h1>Hello, world!</h1>",
    "markdown": "# Hello, world!"
  }
}
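As a minimal sketch of consuming this response, the helper below pulls the Markdown out of a decoded JSON payload. It assumes only the fields shown in the example above (`success`, `data`, `markdown`); the function name `extract_markdown` and the sample payload are illustrative, not part of CaptureKit.

```python
from typing import Optional


def extract_markdown(payload: dict) -> Optional[str]:
    """Return the Markdown string from a decoded /content response, or None.

    Assumes the response shape shown in the example response:
    {"success": bool, "data": {"markdown": str, ...}}.
    """
    if not payload.get("success"):
        return None
    data = payload.get("data") or {}
    # The "markdown" key is only expected when include_markdown=true was sent.
    return data.get("markdown")


# Sample payload mirroring the example response (elided fields left empty).
sample = {
    "success": True,
    "data": {
        "metadata": {},
        "links": {},
        "markdown": "# Hello, world!",
    },
}

print(extract_markdown(sample))  # prints "# Hello, world!"
```

Returning `None` when `success` is false or the key is missing keeps the caller's check to a single `if`, which is one reasonable choice among several.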



  Parameters


url (string, required): URL of the webpage

access_key (string, required): Your API key

include_markdown (boolean, optional): Set to true to include Markdown data (defaults to false)
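The parameters above can be assembled into a request roughly as follows. This is a sketch using only Python's standard library; the helper names `build_content_url` and `fetch_content` are hypothetical, and sending the boolean as the lowercase string `true` is an assumption based on the example request.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base endpoint taken from the example request in this post.
CONTENT_ENDPOINT = "https://api.capturekit.dev/content"


def build_content_url(access_key: str, url: str, include_markdown: bool = False) -> str:
    """Build a /content request URL from the documented query parameters."""
    params = {"access_key": access_key, "url": url}
    if include_markdown:
        # Assumption: the API expects the lowercase literal "true",
        # as written in the example request.
        params["include_markdown"] = "true"
    # urlencode percent-encodes the target URL so it is safe as a query value.
    return f"{CONTENT_ENDPOINT}?{urlencode(params)}"


def fetch_content(access_key: str, url: str, include_markdown: bool = False) -> dict:
    """Perform the GET request and decode the JSON body (requires network access)."""
    with urlopen(build_content_url(access_key, url, include_markdown)) as resp:
        return json.load(resp)
```

Calling `fetch_content("YOUR_KEY", "https://capturekit.dev", include_markdown=True)` would then return the decoded response, where `YOUR_KEY` stands in for a real access key.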

  
  
  Why Markdown?
Markdown provides several advantages over raw HTML:

Readability: Markdown is cleaner and easier to read than HTML

Simplicity: It removes unnecessary styling and formatting

Portability: Easy to use in various applications and platforms

Text Processing: Ideal for content analysis, summarization, and AI processing

  
  
  Use Cases


Content Management: Import web content directly into your CMS

AI Processing: Feed web content to LLMs and other AI systems in a clean format

Documentation: Extract documentation from websites for offline use

Knowledge Bases: Build internal knowledge repositories from web content
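For the AI processing and knowledge base cases, one common preparation step is to split the returned Markdown into sections by heading before passing it on. The sketch below is one simple way to do that, not something CaptureKit provides: `split_by_headings` is a hypothetical helper, and it assumes ATX-style headings (`#` to `######`) and does not treat fenced code blocks specially.

```python
import re
from typing import List, Tuple

# Matches ATX headings such as "# Title" or "### Subsection".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_headings(markdown: str) -> List[Tuple[str, str]]:
    """Split Markdown into (heading, body) pairs.

    Text that appears before the first heading is returned with an
    empty heading string.
    """
    sections: List[Tuple[str, str]] = []
    title: str = ""
    body: List[str] = []

    def flush() -> None:
        # Skip the empty placeholder section before the first heading.
        if title or any(line.strip() for line in body):
            sections.append((title, "\n".join(body).strip()))

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return sections


doc = "# Hello, world!\n\nSome text.\n\n## Details\nMore."
for heading, text in split_by_headings(doc):
    print(repr(heading), repr(text))
```

Each pair can then be stored or sent to a model separately, which keeps individual chunks small and labeled by their heading.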

  
  
Final Notes

This feature was developed in direct response to user feedback. We're committed to building CaptureKit to meet your real-world needs.

Have ideas for more features? Let us know! We're actively developing CaptureKit based on user input.

Thanks for being part of the journey!