Responsible Steam Market Scraping
We all share one big, tasty pie 🥧 called SteamMarket, and we all want a piece of it 🍰;
that's why we're here. However, it's crucial to approach market data collection responsibly.
Consume Steam resources responsibly. Bear in mind, we all benefit if Steam spends fewer resources fighting aggressive market scraping. Excessive scraping can lead to IP bans, rate limits, and a worse experience for everyone.
HTTP Caching with If-Modified-Since
One of the most effective ways to scrape responsibly is to implement proper HTTP caching using the If-Modified-Since 
header. This standard HTTP mechanism allows clients to:
- Retrieve data only when it has changed since the last request
- Reduce bandwidth usage for both client and server
- Minimize the risk of hitting rate limits
- Create more efficient and responsive applications
 
How It Works
The HTTP caching mechanism works as follows:
- When you first request a resource, the server includes a Last-Modified header in the response
- For subsequent requests, you include an If-Modified-Since header with the timestamp from the previous response
- If the resource hasn't changed since that time, the server returns a 304 Not Modified status code without the resource body
- If the resource has changed, the server returns the updated resource with a new Last-Modified timestamp
In aiosteampy, this mechanism is implemented through the if_modified_since parameter and the ResourceNotModified exception.
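The request/response steps above can be sketched without any network access. The snippet below is a toy model only: `respond` is a hypothetical function standing in for a server, and the payload string is a placeholder. It does not reflect how Steam or aiosteampy implement the exchange; it just shows which status code each step produces.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(resource_modified_at, request_headers):
    """Toy server logic: return (status, headers, body) for a GET request."""
    last_modified = format_datetime(resource_modified_at, usegmt=True)
    since = request_headers.get("If-Modified-Since")
    if since is not None and resource_modified_at <= parsedate_to_datetime(since):
        # Unchanged since the client's copy: no body is sent.
        return 304, {"Last-Modified": last_modified}, None
    return 200, {"Last-Modified": last_modified}, "order book payload"


modified_at = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

# First request: no conditional header, so the full response comes back.
status, headers, body = respond(modified_at, {})
first = (status, body)

# Repeat request with the stored timestamp: 304 and no body.
conditional = {"If-Modified-Since": headers["Last-Modified"]}
status, _, body = respond(modified_at, conditional)
second = (status, body)

# The resource changes later: full response with a new Last-Modified value.
changed_at = datetime(2024, 1, 1, 12, 5, 0, tzinfo=timezone.utc)
status, headers, body = respond(changed_at, conditional)
third = (status, headers["Last-Modified"])
```

The client never has to interpret the timestamp; it only stores the value it received and sends it back unchanged.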
Implementation in aiosteampy
Several methods in aiosteampy support the if_modified_since parameter:
- SteamCommunityPublicMixin.get_item_orders_histogram
- SteamCommunityPublicMixin.get_item_listings
- SteamCommunityPublicMixin.market_listings
These methods:
- Accept an optional if_modified_since parameter (either a datetime object or a formatted string)
- Return a last_modified timestamp along with the requested data
- Raise a ResourceNotModified exception when the resource hasn't changed
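If you store timestamps yourself (for example, between runs), it can help to see how a datetime relates to its string form. The sketch below assumes the "formatted string" is a standard HTTP date such as `Mon, 01 Jan 2024 12:00:00 GMT`; that is an assumption, so check the aiosteampy documentation for the exact format it expects. Only Python standard library functions are used.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# A timezone-aware UTC datetime, e.g. a stored last_modified value.
last_modified = datetime(2024, 1, 1, 12, 0, 0, 250000, tzinfo=timezone.utc)

# HTTP-date form; usegmt=True requires tzinfo to be exactly timezone.utc.
as_header = format_datetime(last_modified, usegmt=True)

# Round trip: HTTP dates have one-second resolution, so microseconds are dropped.
parsed = parsedate_to_datetime(as_header)
```

Because of the one-second resolution, it is safest to pass back the value you received rather than a timestamp you generated locally.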
Basic Example
Here's a simple example of how to use this mechanism:
from aiosteampy import ResourceNotModified, SteamPublicClient
client = SteamPublicClient(...)
# Initial request to get data and last_modified timestamp
histogram, last_modified = await client.get_item_orders_histogram(123456)
# Later, when you need to check for updates
try:
    # Pass the previous last_modified timestamp
    histogram, last_modified = await client.get_item_orders_histogram(
        123456, 
        if_modified_since=last_modified,  # Use the timestamp from the previous response
    )
    # Process the updated data
    print("Data has been updated!")
    # Do something with the new histogram data
except ResourceNotModified:
    print("Data hasn't changed since last request")
    # Use your cached data instead
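The same pattern extends naturally to a polling loop. The following is a minimal sketch that runs offline: `FakeClient` and the local `ResourceNotModified` class are hypothetical stand-ins for the aiosteampy client and exception, and the integer "timestamp" is a simplification. With a real client you would import `ResourceNotModified` from aiosteampy and choose a much longer interval.

```python
import asyncio


class ResourceNotModified(Exception):
    """Stand-in for aiosteampy.ResourceNotModified so the sketch runs offline."""


class FakeClient:
    """Hypothetical stub: the data changes only on the third call."""

    def __init__(self):
        self.calls = 0

    async def get_item_orders_histogram(self, item_nameid, if_modified_since=None):
        self.calls += 1
        version = 2 if self.calls >= 3 else 1
        if if_modified_since == version:
            raise ResourceNotModified
        return f"histogram v{version}", version  # (data, last_modified)


async def poll(client, item_nameid, rounds, interval):
    """Collect only the responses that carried new data."""
    updates = []
    last_modified = None
    for _ in range(rounds):
        try:
            histogram, last_modified = await client.get_item_orders_histogram(
                item_nameid, if_modified_since=last_modified
            )
            updates.append(histogram)
        except ResourceNotModified:
            pass  # nothing new; keep the previous data and timestamp
        await asyncio.sleep(interval)  # fixed pause between requests
    return updates


updates = asyncio.run(poll(FakeClient(), 123456, rounds=4, interval=0.01))
```

Four requests are made, but only two of them return a body; the other two end in `ResourceNotModified` and cost almost nothing.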
Advanced Implementation with Caching
For a more complete implementation with caching:
from aiosteampy import ResourceNotModified
class SimpleCache:
    def __init__(self):
        self.data = {}
        self.timestamps = {}
    def get(self, key):
        return self.data.get(key), self.timestamps.get(key)
    def set(self, key, data, timestamp):
        self.data[key] = data
        self.timestamps[key] = timestamp
# Create a cache
cache = SimpleCache()
item_nameid = 123456
async def get_histogram_with_cache(client, item_nameid):
    # Try to get from cache
    cached_data, last_modified = cache.get(item_nameid)
    try:
        # Always make the request, but with if_modified_since if we have cached data
        histogram, new_last_modified = await client.get_item_orders_histogram(
            item_nameid,
            if_modified_since=last_modified,  # None when nothing is cached yet
        )
        # Update cache with new data
        cache.set(item_nameid, histogram, new_last_modified)
        return histogram
    except ResourceNotModified:
        # If data hasn't changed, use cached data
        print("Using cached data - resource not modified")
        return cached_data
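One possible variation keeps the data and its timestamp in a single entry, so the two can never drift apart, and uses a plain dict keyed by item id. This is a sketch under assumptions: `CacheEntry`, `get_histogram`, and `FakeClient` are hypothetical names, and the local `ResourceNotModified` class stands in for the aiosteampy exception so the example runs offline.

```python
import asyncio
from dataclasses import dataclass
from typing import Any


class ResourceNotModified(Exception):
    """Stand-in for aiosteampy.ResourceNotModified so the sketch runs offline."""


@dataclass
class CacheEntry:
    data: Any
    last_modified: Any


class FakeClient:
    """Hypothetical stub whose data never changes after the first response."""

    def __init__(self):
        self.full_responses = 0

    async def get_item_orders_histogram(self, item_nameid, if_modified_since=None):
        if if_modified_since is not None:
            raise ResourceNotModified
        self.full_responses += 1
        return f"histogram for {item_nameid}", "Mon, 01 Jan 2024 12:00:00 GMT"


async def get_histogram(client, cache, item_nameid):
    entry = cache.get(item_nameid)
    try:
        data, last_modified = await client.get_item_orders_histogram(
            item_nameid,
            if_modified_since=entry.last_modified if entry else None,
        )
    except ResourceNotModified:
        return entry.data  # an entry exists: the header is only sent when it does
    cache[item_nameid] = CacheEntry(data, last_modified)
    return data


async def demo():
    client, cache = FakeClient(), {}
    results = [await get_histogram(client, cache, n) for n in (1, 1, 2, 2, 1)]
    return results, client.full_responses


results, full_responses = asyncio.run(demo())
```

Five lookups across two items result in only two full downloads; the remaining three are answered from the cache.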
Benefits
Using the If-Modified-Since mechanism provides several benefits:
- Reduced Bandwidth: You only download the full data when it has actually changed
- Fewer Rate Limits: You're less likely to hit Steam's 429: Too Many Requests errors
- Faster Responses: 304 responses are faster as they don't include the resource body
- Server-Friendly: Reduces load on Steam's servers, making you a good API citizen
- More Reliable: Your application can continue to function even during high-traffic periods
 
By implementing proper caching with the if_modified_since parameter, you can create more efficient and 
reliable applications that interact with the Steam Market.