[Test] Add integration tests for complex and larger variety of webpages · Issue #15 · mendableai/firecrawl · GitHub
Closed
2 tasks
oliviermills opened this issue Apr 18, 2024 · 2 comments
Comments

@oliviermills

As the HTML cleanup and HTML-to-Markdown conversion get tweaked and extended, I highly recommend adding integration tests using either live webpages (which also exercise the fetch/network path and dynamic websites) or at least saved HTML pages with complex layouts (and bad HTML, especially for the HTML cleanup).

  • Find a list of pages to use as a test suite, covering a variety of layouts
  • Add the integration tests
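
A minimal sketch of what the saved-page variant could look like, assuming a cleanup step that drops `script`, `nav`, and `footer` content and tolerates unclosed tags. `clean_html`, `SAVED_PAGES`, and `run_saved_page_checks` are hypothetical names used for illustration, not the project's API, and the inline strings stand in for HTML files saved on disk.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical stand-in for the project's HTML cleanup step."""

    SKIP = {"script", "style", "nav", "footer"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def clean_html(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Each entry: page name -> (html, text that must survive, text that must be gone).
# In a real suite these would be loaded from saved .html fixture files.
SAVED_PAGES = {
    "unclosed_tags": (
        "<html><body><div><p>First paragraph<p>Second paragraph<div>Nested</body>",
        ["First paragraph", "Second paragraph", "Nested"],
        [],
    ),
    "script_and_nav_noise": (
        "<nav><a href='/'>Home</a></nav><script>var x = '<p>fake</p>';</script>"
        "<main><h1>Title</h1><p>Body &amp; text</p></main><footer>© site</footer>",
        ["Title", "Body & text"],
        ["Home", "fake", "site"],
    ),
}


def run_saved_page_checks():
    """Return a list of human-readable failures (empty when all pages pass)."""
    failures = []
    for name, (html, wanted, unwanted) in SAVED_PAGES.items():
        cleaned = clean_html(html)
        failures += [f"{name}: missing {t!r}" for t in wanted if t not in cleaned]
        failures += [f"{name}: leaked {t!r}" for t in unwanted if t in cleaned]
    return failures
```

Checking for required and forbidden substrings, rather than an exact expected output, keeps such tests stable while the cleanup rules are still changing.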
@nickscamara
Member

  • Add more specific tests like h1 -> #
  • Put together a small dataset of website -> Markdown pairs
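
The two items above could be combined into one table-driven test, roughly as sketched here. The converter is a toy stand-in that only handles headings, paragraphs, and list items; `html_to_markdown`, `CASES`, and `run_cases` are hypothetical names, and a real suite would call the project's own converter instead.

```python
from html.parser import HTMLParser

# Markdown prefix for each block-level tag the toy converter understands.
BLOCK_PREFIX = {f"h{n}": "#" * n + " " for n in range(1, 7)}
BLOCK_PREFIX.update({"p": "", "li": "- "})


class _TinyMarkdown(HTMLParser):
    """Toy converter used only to make the test table runnable."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []
        self._prefix = ""
        self._current = None  # text pieces of the open block, or None

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_PREFIX:
            self.flush_block()  # an unclosed previous block ends here
            self._prefix = BLOCK_PREFIX[tag]
            self._current = []

    def handle_endtag(self, tag):
        if tag in BLOCK_PREFIX:
            self.flush_block()

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def flush_block(self):
        if self._current is not None:
            # Collapse runs of whitespace inside the block.
            text = " ".join("".join(self._current).split())
            if text:
                self.lines.append(self._prefix + text)
        self._current = None


def html_to_markdown(html: str) -> str:
    parser = _TinyMarkdown()
    parser.feed(html)
    parser.close()
    parser.flush_block()
    return "\n".join(parser.lines)


# Small (html, expected markdown) dataset, starting with the h1 -> # case.
CASES = [
    ("<h1>Title</h1>", "# Title"),
    ("<h2>Sub  heading</h2>", "## Sub heading"),
    ("<h1>Docs</h1><p>Hello <b>world</b></p>", "# Docs\nHello world"),
    ("<ul><li>one<li>two</ul>", "- one\n- two"),
]


def run_cases():
    """Return (html, expected, actual) for every mismatching case."""
    return [
        (html, expected, html_to_markdown(html))
        for html, expected in CASES
        if html_to_markdown(html) != expected
    ]
```

Keeping the dataset as plain pairs makes it easy to grow: each new website -> Markdown sample is one more entry in `CASES`.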

@rafaelsideguide
Collaborator

Closing this one (related to #118)
