Types of Tasks
Lioth is designed as a general-purpose protocol for distributing tasks to humans and verifying outputs through Proof of Human Knowledge. The protocol supports a wide range of task formats, from high-volume microtasks to expert workflows and longitudinal research. Task types are grouped below by the kind of human signal they produce and by their typical validation requirements.
Research and Survey Tasks
These tasks collect structured answers from contributors, typically as multiple-choice questions, rating scales, or short written responses. Common requester needs include measuring attitudes, gathering demographic breakdowns, testing messaging, or collecting user feedback. The output may include a JSON row per participant with fields such as answers, timing metadata, and an optional free-text rationale. Validation focuses on completion integrity, attention checks, and anomaly signals.
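As an illustration, a per-participant output row might look like the sketch below. The field names are hypothetical, chosen for this example, and do not represent a fixed Lioth schema.

```python
import json

# Hypothetical per-participant survey output row. Field names are
# illustrative only, not a defined Lioth schema.
row = {
    "participant_id": "p-001",
    "answers": {"q1": "agree", "q2": 4},
    "timing_ms": {"q1": 3200, "q2": 1800},          # per-question timing metadata
    "rationale": "Chose 4 because the feature felt responsive.",
    "attention_check_passed": True,                  # completion-integrity signal
}

# Serialize for downstream validation and settlement.
print(json.dumps(row, indent=2))
```

A validator could then run attention-check and anomaly rules over rows in this shape before settlement.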
Interactive Studies and Experiments
These tasks measure behavior in a framed environment, such as a web experiment, a prototype test, or a stimulus-response workflow. Execution may happen in an external study environment, while Lioth handles eligibility, verification receipts, and settlement.
Requesters look for reaction-based experiments, choice modeling, memory or perception tests, and controlled behavioral tasks. Outputs may be structured logs and experiment results mapped into a standard schema, and a protocol receipt indicating completion and verification strength.
Validation looks for protocol compliance, attention checks, session integrity, and patterns across participants.
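One session-integrity signal mentioned above can be sketched as flagging sessions whose response times are implausibly fast. The 500 ms floor and the session structure here are illustrative assumptions, not protocol parameters.

```python
def flag_fast_sessions(sessions, min_ms=500):
    """Flag sessions whose median response time falls below `min_ms`.

    Uniformly fast answers across a session often indicate clicking
    through without reading the stimuli. The 500 ms floor is an
    illustrative default, not a Lioth-defined threshold.
    """
    flagged = []
    for session_id, times in sessions.items():
        ordered = sorted(times)
        median = ordered[len(ordered) // 2]
        if median < min_ms:
            flagged.append(session_id)
    return flagged

# Hypothetical per-question response times in milliseconds.
sessions = {
    "s1": [1200, 900, 1500],  # plausible pacing
    "s2": [150, 200, 180],    # suspiciously fast throughout
}
print(flag_fast_sessions(sessions))  # ["s2"]
```

In practice this would be one signal among several, combined with attention checks and cross-participant pattern analysis.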
Human Rating, Comparison, and Preference Tasks
These tasks provide human judgement at scale: the requester supplies items to compare or evaluate, and contributors score them against a rubric. Concrete examples:
- A contributor ranks five product descriptions by clarity.
- A contributor compares two model answers and selects the better one.
- A contributor scores a response on correctness, helpfulness, and safety from 1 to 5, with a short justification.
Outputs include scores, ranked lists, chosen options, and optional justifications. Validation relies on agreement rates and audit rates that can be quantified and reported.
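One way the agreement rates above can be quantified is simple pairwise observed agreement, averaged over items. This is a minimal sketch under assumed inputs; a production system might prefer a chance-corrected statistic such as Krippendorff's alpha.

```python
from itertools import combinations

def pairwise_agreement(ratings):
    """Fraction of rater pairs that gave the same label, averaged over items.

    `ratings` maps item id -> list of labels from different contributors.
    This is plain observed agreement, not chance-corrected.
    """
    total = agree = 0
    for labels in ratings.values():
        for a, b in combinations(labels, 2):
            total += 1
            agree += (a == b)
    return agree / total if total else 0.0

# Hypothetical ratings from three contributors per item.
ratings = {
    "item-1": ["A", "A", "B"],   # 1 of 3 pairs agree
    "item-2": ["B", "B", "B"],   # 3 of 3 pairs agree
}
print(pairwise_agreement(ratings))  # 4 agreeing pairs out of 6
```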
Labeling and Categorization Tasks
These tasks convert content into labels. They are often used for training data, search quality, and analytics. Examples:
- Assign a category to a support ticket.
- Label whether a comment is abusive, hateful, or benign.
- Mark which search result is more relevant.
- Extract entities such as company names, locations, or product types into fields.
Outputs include discrete labels, multi-label tags, and structured extracted fields.
Validation uses rubric review, gold-task calibration, and consensus thresholds to reduce noise.
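The consensus-threshold idea can be sketched as accepting a label only when a sufficient share of contributors agree on it. The 0.7 threshold is an illustrative default, not a protocol constant; real deployments would tune it per task type and combine it with gold-task calibration.

```python
from collections import Counter

def consensus_label(labels, threshold=0.7):
    """Return the majority label if its share meets `threshold`, else None.

    A None result means no consensus was reached and the item would
    typically be escalated for more ratings or audit review.
    """
    if not labels:
        return None
    top, count = Counter(labels).most_common(1)[0]
    return top if count / len(labels) >= threshold else None

print(consensus_label(["spam", "spam", "spam", "benign"]))  # 3/4 = 0.75 -> "spam"
print(consensus_label(["spam", "benign", "abusive"]))       # no consensus -> None
```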
Data Extraction and Verification Microtasks
These tasks produce structured information from messy sources, or verify that information is correct. Examples:
- Read a webpage snippet and extract price, brand, and model into fields.
- Verify whether an address format is valid without storing the full address on-chain.
- Check whether two records refer to the same entity (deduplication).
- Validate that a claim is supported within the provided context.
Common outputs include structured fields, boolean checks, and confidence values.
Validation focuses on consistency, duplication detection, and audit sampling for high-impact fields.
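A deduplication check like the one above can be sketched as token overlap between normalized record names. The normalization rules and the 0.5 Jaccard threshold are illustrative assumptions; a real pipeline would add fuzzy matching and field-level comparison.

```python
import re

def normalize(record):
    """Lowercase and strip punctuation, returning comparison tokens."""
    return re.sub(r"[^a-z0-9 ]", "", record.lower()).split()

def likely_same_entity(a, b):
    """Crude token-overlap (Jaccard) check between two record names.

    The 0.5 threshold is illustrative; production matching would combine
    several signals and output a confidence value alongside the boolean.
    """
    ta, tb = set(normalize(a)), set(normalize(b))
    if not ta or not tb:
        return False
    return len(ta & tb) / len(ta | tb) >= 0.5

print(likely_same_entity("Acme Corp.", "ACME Corp"))  # True
print(likely_same_entity("Acme Corp", "Globex Inc"))  # False
```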
Content Moderation and Policy Review
These tasks apply a policy rubric to content that can be used for safety labeling, compliance screening, or platform governance workflows. Examples:
- Decide whether a post violates policy and select the violation category.
- Rate severity and recommend an action.
- Identify whether content contains disallowed personal data.
Outputs include label decisions and a rubric-based rationale. Validation may require higher verification tiers due to subjectivity and adversarial pressure.
UX Testing and Qualitative Feedback
These tasks collect user feedback on interfaces, flows, or prototypes. They can include structured ratings, open feedback, or step-by-step usability traces. Examples:
- Navigate a prototype and report where you get stuck.
- Complete a checkout flow and rate trust, clarity, and friction.
- Answer a short set of questions after using a feature.
Outputs may include ratings, structured issue reports, and optional media files depending on privacy mode.
Validation focuses on completion proof and rubric compliance.
Research, Synthesis, and Expert Jobs
These are longer tasks where a requester wants a coherent deliverable rather than a single label, often involving multi-step reasoning and higher skill. Examples:
- Produce a structured research brief comparing three vendors with cited sources provided by the requester.
- Write a step-by-step operating procedure for a workflow.
- Perform expert adjudication on borderline moderation cases.
Typical outputs include structured documents, decision logs, and evidence fields.
Validation uses milestone-based reviews, a higher validator quorum, and arbitration support.
Forecasting and Scenario Tasks
These tasks produce probabilistic judgements about future events with rationale. Examples:
- Assign a probability to a statement resolving within 3 months and justify the decision.
- Provide a scenario tree with likelihood weights.
- Update probabilities after new information is provided.
Outputs may include probabilities, confidence intervals and rationale fields.
Validation focuses on calibration, consistency over time and audit review for outliers.
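Calibration over resolved forecasts can be measured with a Brier score, the mean squared error between stated probabilities and binary outcomes. The forecast data below is hypothetical; the scoring rule itself is standard.

```python
def brier_score(forecasts):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    `forecasts` is a list of (probability, outcome) pairs, where outcome
    is 1 if the event resolved true. Lower is better; an uninformative
    constant forecast of 0.5 scores 0.25.
    """
    return sum((p - o) ** 2 for p, o in forecasts) / len(forecasts)

# Hypothetical resolved forecasts: (stated probability, actual outcome).
resolved = [(0.9, 1), (0.2, 0), (0.6, 1)]
print(brier_score(resolved))  # (0.01 + 0.04 + 0.16) / 3 = 0.07
```

Tracking this score per contributor over time gives the consistency signal mentioned above, and outliers can be routed to audit review.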
Longitudinal and Recursive Tasks
These tasks repeat over time with the same participants or cohort. They are used for diary studies, recurring evaluation, or follow-up experiments. Concrete examples:
- Weekly product usage diary entries.
- Repeated preference evaluations over multiple model versions.
- Follow-up interviews after an initial survey.
Other Tasks
The protocol can adapt to other task types as requester demand evolves, keeping it at the front line of the market's needs through a focus on adaptability and adoption. It supports any task where the output is a choice, a score, a label, a structured form, or a short or long explanation, verified by multi-validator consensus and audits.