Skip to main content

Data Retention

In May 2024, LangSmith introduced a maximum data retention period on traces of 400 days. In June 2024, LangSmith introduced a new data retention based pricing model where customers can configure a shorter data retention period on traces in exchange for savings up to 10x. On this page, we'll go through how data retention works and is priced in LangSmith.

Why retention mattersโ€‹

Privacyโ€‹

Many data privacy regulations, such as GDPR in Europe or CCPA in California, require organizations to delete personal data once it's no longer necessary for the purposes for which it was collected. Setting retention periods aids in compliance with such regulations.

Costโ€‹

LangSmith charges less for traces that have low data retention. See our tutorial on how to optimize spend for details.

How it worksโ€‹

The basicsโ€‹

LangSmith now has two tiers of traces based on Data Retention with the following characteristics:

BaseExtended
Price.05ยข / trace.50ยข / trace
Retention Period14 days400 days

Data deletion after retention endsโ€‹

After the specified retention period, traces are no longer accessible via the runs table or API. All user data associated with the trace (e.g. inputs and outputs) is deleted from our internal systems within a day thereafter. Some metadata associated with each trace may be retained indefinitely for analytics and billing purposes.

Data retention auto-upgradesโ€‹

caution

Auto upgrades can have an impact on your bill. Please read this section carefully to fully understand your estimated LangSmith tracing costs.

When you use certain features with base tier traces, their data retention will be automatically upgraded to extended tier. This will increase both the retention period, and the cost of the trace.

The complete list of scenarios in which a trace will upgrade when:

  • Feedback is added to any run on the trace
  • An Annotation Queue receives any run from the trace
  • A Run Rule matches any run within a trace

Why auto-upgrade traces?โ€‹

We have two reasons behind the auto-upgrade model for tracing:

  1. We think that traces that match any of these conditions are fundamentally more interesting than other traces, and therefore it is good for users to be able to keep them around longer.
  2. We philosophically want to charge customers an order of magnitude lower for traces that may not be interacted with meaningfully. We think auto-upgrades align our pricing model with the value that LangSmith brings, where only traces with meaningful interaction are charged at a higher rate.

If you have questions or concerns about our pricing model, please feel free to reach out to support@langchain.dev and let us know your thoughts!

How does data retention affect downstream features?โ€‹

Annotation Queues, Run Rules, and Feedbackโ€‹

Traces that use these features will be auto-upgraded.

Monitoringโ€‹

The monitoring tab will continue to work even after a base tier trace's data retention period ends. It is powered by trace metadata that exists for >30 days, meaning that your monitoring graphs will continue to stay accurate even on base tier traces.

Datasetsโ€‹

Datasets have an indefinite data retention period. Restated differently, if you add a trace's inputs and outputs to a dataset, they will never be deleted. We suggest that if you are using LangSmith for data collection, you take advantage of the datasets feature.

Billing modelโ€‹

Billable metricsโ€‹

On your LangSmith invoice, you will see two metrics that we charge for:

  • LangSmith Traces (Base Charge)
  • LangSmith Traces (Extended Data Retention Upgrades).

The first metric includes all traces, regardless of tier. The second metric just counts the number of extended retention traces.

Why measure all traces + upgrades instead of base and extended traces?โ€‹

A natural question to ask when considering our pricing is why not just show the number of base tier and extended tier traces directly on the invoice?

While we understand this would be more straightforward, it doesn't fit trace upgrades properly. Consider a base tier trace that was recorded on June 30, and upgraded to extended tier on July 3. The base tier trace occurred in the June billing period, but the upgrade occurred in the July billing period. Therefore, we need to be able to measure these two events independently to properly bill our customers.

If your trace was recorded as an extended retention trace, then the base and extended metrics will both be recorded with the same timestamp.

Cost breakdownโ€‹

The Base Charge for a trace is .05ยข per trace. We priced the upgrade such that an extended retention trace costs 10x the price of a base tier trace (.50ยข per trace) including both metrics. Thus, each upgrade costs .45ยข.


Was this page helpful?


You can leave detailed feedback on GitHub.