Agents
Introducing COMPOSITE-STEM, 70 expert-curated agentic tasks across Physics, Biology, Chemistry, and Math, compatible with the Harbor Framework.
In our recent paper, we introduce AsymmetryZero, a framework for operationalizing human expert preferences as semantic evals.
Portex has integrated with Harbor, a popular framework for testing agents on realistic workflows.