How to Scrape Apartments.com

Crawlbase - Apr 25 - - Dev Community

This blog was originally posted to Crawlbase Blog

Having the right information at the right time can make a difference for real estate professionals and their clients. In a field where accuracy is often crucial, Apartments.com is one platform that delivers it. With a vast number of property listings, market insights, and neighborhood details in its datasets, Apartments.com offers information that is useful to home seekers, sellers, and real estate agents gathering data for their clients. Over the last three months alone, Apartments.com has seen approximately 48.7 million visits, highlighting its popularity and utility in the industry.

Image: Apartments.com monthly visitors

In this blog, we will show you how to scrape Apartments.com using JavaScript and the Crawlbase Crawling API. You’ll learn how to scrape essential property data such as the property title, description, price, location, features, and size without facing blocks or restrictions.

Table of Contents

Step 1: Setup Necessary Tools for Custom Apartments.com Scraper

Step 2: Setup the Project

Step 3: Extract HTML data from Apartments.com

Step 4: Scrape Apartments.com in JSON format

Final Words

Frequently Asked Questions

Step 1: Setup Necessary Tools for Custom Apartments.com Scraper

Before we get into coding, let's set up our environment with the necessary tools. Here's what you'll need to get started:

Node.js allows you to run JavaScript locally, which is essential for executing our web scraping script. You can download Node.js from the official website. Since our project heavily relies on JavaScript, it's important to grasp fundamental concepts like variables, functions, loops, and basic DOM manipulation. If you're new to JavaScript, resources like Mozilla Developer Network (MDN) or W3Schools can be helpful.

Later in this tutorial, we will use the Crawlbase Crawling API to perform effective web scraping. Your API token will authenticate requests and enable the Crawling API's features. Obtain your token by creating an account on the Crawlbase website and accessing your API tokens from the account documentation section.
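One way to keep the token out of your source code is to read it from an environment variable. The sketch below assumes a variable named CRAWLBASE_TOKEN; that name and the getCrawlbaseToken helper are chosen here for illustration and are not required by the Crawlbase library.

```javascript
// Read the Crawlbase token from the environment instead of hard-coding it.
// CRAWLBASE_TOKEN is a hypothetical variable name chosen for this sketch.
function getCrawlbaseToken(env = process.env) {
  const token = (env.CRAWLBASE_TOKEN || '').trim();
  if (!token) {
    throw new Error('Set the CRAWLBASE_TOKEN environment variable first.');
  }
  return token;
}
```

With this in place, you could start the script as CRAWLBASE_TOKEN=your_token node scraper.js and pass getCrawlbaseToken() wherever the token string is expected.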

Image: Prerequisites for a custom Apartments.com scraper

Step 2: Setup the Project

Here's how to set up your project for scraping Apartments.com data:

Create a New Project Folder:
Open your terminal and type mkdir apartment-scraper to make a new folder for your project.

mkdir apartment-scraper

Navigate to the Project Folder:
Enter cd apartment-scraper to move into the newly created folder.

cd apartment-scraper

Create a JavaScript File:
Type touch scraper.js to create a new JavaScript file named scraper.js in your project folder.

touch scraper.js

Add the Crawlbase Package:
Install the Crawlbase Node library by running npm install crawlbase in your terminal. This library helps connect to the Crawlbase Crawling API for scraping Apartments.com data.

npm install crawlbase

Install Cheerio:
Install the HTML parser with npm install cheerio. Cheerio handles HTML parsing for your Apartments.com scraper project. The fs module used for file system interactions is built into Node.js, so it does not need to be installed from npm.

npm install cheerio

After completing these steps, you'll be ready to build your Apartments.com data scraper!

Step 3: Extract HTML data from Apartments.com

Image: Apartments.com home page

Now that you have your API credentials and the Node.js library for web scraping installed, let’s begin setting up the "scraper.js" file. Choose the Apartments.com page you wish to scrape data from. For this example, let's focus on a house rental page. In the "scraper.js" file, use the Crawlbase library to fetch the specified Apartments.com page and the built-in fs module to store the HTML in a “response.html” file. Make sure to replace the placeholder URL in the code with the actual URL you intend to scrape.

JS Code:

const { CrawlingAPI } = require('crawlbase'),
  fs = require('fs'),
  crawlbaseToken = 'YOUR_CRAWLBASE_TOKEN',
  api = new CrawlingAPI({ token: crawlbaseToken }),
  apartmentPageURL = 'https://www.apartments.com/2630-n-hamlin-ave-chicago-il/kvl7tm9/';

api.get(apartmentPageURL).then(handleCrawlResponse).catch(handleCrawlError);

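function handleCrawlResponse(response) {
  if (response.statusCode === 200) {
    fs.writeFileSync('response.html', response.body);
    console.log('HTML saved to response.html');
  } else {
    // Report non-200 responses instead of finishing silently
    console.error(`Request failed with status code ${response.statusCode}`);
  }
}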

function handleCrawlError(error) {
  console.error(error);
}

The provided code snippet uses the Crawlbase library to extract HTML content from an Apartments.com web page. The script begins by creating a Crawling API instance with a specified token, then sends a GET request to the Apartments.com page. If the response is successful with a status code of 200, it saves the HTML content to a file named "response.html". If any errors occur during the crawling process, the script logs the error message to the console.
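Requests can fail intermittently, so it may help to retry before giving up. The following is a minimal sketch, not part of the Crawlbase library: fetchWithRetry, retries, and delayMs are hypothetical names, and fetchPage is assumed to be any function that takes a URL and resolves to an object with statusCode and body, such as (url) => api.get(url) from the script above.

```javascript
// Retry a fetch-like function a few times before giving up.
// fetchPage is assumed to return a promise of { statusCode, body }.
async function fetchWithRetry(fetchPage, url, { retries = 3, delayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      const response = await fetchPage(url);
      if (response.statusCode === 200) {
        return response.body;
      }
      lastError = new Error(`Unexpected status code: ${response.statusCode}`);
    } catch (error) {
      lastError = error;
    }
    if (attempt < retries) {
      // Wait before the next attempt; the delay grows with each failure.
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
  throw lastError || new Error('No attempts were made');
}
```

A call might look like fetchWithRetry((url) => api.get(url), apartmentPageURL).then((html) => fs.writeFileSync('response.html', html)). Passing the fetch function in as an argument also makes the helper easy to test with a stub.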

HTML Output:

Image: Scraping HTML Apartments.com

Step 4: Scrape Apartments.com in JSON format

In this section, we'll explore how to scrape valuable data from an Apartments.com web page. The data we aim to scrape includes property title, description, price, location, features, size, and more. To achieve this, we'll create an Apartments.com scraper using two libraries: cheerio, commonly used for web scraping, and fs, which assists with file operations. The script will parse the HTML of the Apartments.com page, extract the desired details, and store them in a JSON object.

JS Code:

const fs = require('fs');
const cheerio = require('cheerio');

// Read the HTML file
const html = fs.readFileSync('response.html', 'utf-8');

// Load HTML content into Cheerio
const $ = cheerio.load(html);

// Extract property details using Cheerio selectors
const propertyDetails = {};

// Get property name
propertyDetails.name = $('#propertyName').text().trim();

// Get monthly rent
propertyDetails.rent = $('.rentInfoDetail').eq(0).text().trim();

// Get number of bedrooms
propertyDetails.bedrooms = $('.rentInfoDetail').eq(1).text().trim();

// Get number of bathrooms
propertyDetails.bathrooms = $('.rentInfoDetail').eq(2).text().trim();

// Get property size
propertyDetails.size = $('.rentInfoDetail').eq(3).text().trim();

// Get lease details and parse into structured fields
const leaseDetailsText = $('.detailsTextWrapper').text().trim();
const leaseDetailsParts = leaseDetailsText.split(',').map((part) => part.trim());
const leaseDetails = {};
leaseDetails.leaseDuration = leaseDetailsParts.find((part) => part.toLowerCase().includes('month lease')) || '';
leaseDetails.depositAmount = leaseDetailsParts.find((part) => part.toLowerCase().includes('deposit')) || '';
leaseDetails.availability = leaseDetailsParts.find((part) => part.toLowerCase().includes('available')) || '';
propertyDetails.leaseDetails = leaseDetails;

// Extract property location (address)
const propertyAddressElement = $('.propertyAddress');
if (propertyAddressElement.length > 0) {
  const addressComponents = propertyAddressElement
    .find('span')
    .map((index, element) => {
      return $(element).text().trim();
    })
    .get();
  const location = addressComponents.join(', ');
  propertyDetails.location = location;
}

// Extract house features (amenities)
const amenitiesSection = $('#amenitiesSection');
if (amenitiesSection.length > 0) {
  const houseFeatures = [];
  amenitiesSection.find('.combinedAmenitiesList .specInfo span').each((index, element) => {
    const feature = $(element).text().trim();
    houseFeatures.push(feature);
  });
  propertyDetails.houseFeatures = houseFeatures;
}

// Extract property description
const descriptionSection = $('#descriptionSection');
if (descriptionSection.length > 0) {
  const descriptionText = descriptionSection.find('p:first-of-type').text().trim();
  // Replace newline characters (\n) with spaces
  propertyDetails.description = descriptionText.replace(/\n/g, ' ');
}

// Convert propertyDetails object to JSON format
const propertyDetailsJSON = JSON.stringify(propertyDetails, null, 2);

// Output the JSON data to the terminal
console.log(propertyDetailsJSON);

The provided JavaScript code creates a custom Apartments.com scraper which uses Cheerio to scrape and extract property details from an HTML file. It parses the response.html file to scrape data such as property name, monthly rent, bedrooms, bathrooms, size, lease details (duration, deposit, availability), location (address), house features (amenities), and description. The code leverages Cheerio selectors to navigate through the HTML structure, extract specific elements and text content, and format the extracted data into a structured JSON object.
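The lease-details step can be isolated as a pure function, which makes it testable without any HTML. The sketch below is a variation on the script's logic, not the original code: parseLeaseDetails is a hypothetical name, the sample input string is invented, and the split pattern skips commas that are followed by three digits. That choice assumes amounts may contain thousands separators (for example "$2,350 deposit"), which a plain split(',') would cut in two.

```javascript
// Split a comma-separated lease summary into named fields.
// Commas followed by three digits are treated as thousands separators
// and are not used as split points (an assumption about the input text).
function parseLeaseDetails(text) {
  const parts = text.split(/,(?!\d{3})/).map((part) => part.trim());
  const findPart = (keyword) =>
    parts.find((part) => part.toLowerCase().includes(keyword)) || '';
  return {
    leaseDuration: findPart('month lease'),
    depositAmount: findPart('deposit'),
    availability: findPart('available'),
  };
}
```

In the scraper, this could replace the inline block with propertyDetails.leaseDetails = parseLeaseDetails($('.detailsTextWrapper').text().trim()).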

JSON Output:

{
  "name": "2630 N Hamlin Ave",
  "rent": "$2,350",
  "bedrooms": "2 bd",
  "bathrooms": "1 ba",
  "size": "1,000 sq ft",
  "leaseDetails": {
    "leaseDuration": "12 Month Lease",
    "depositAmount": "350 deposit",
    "availability": "Available Now"
  },
  "location": "Property Address:, 2630 N Hamlin Ave, Chicago, IL, 60647",
  "houseFeatures": ["Air Conditioning", "Dishwasher", "Basement", "Laundry Facilities"],
  "description": "Charming 2-Bedroom Home in Vibrant Logan Square - Recently Renovated Discover the blend of modern living and neighborhood charm at 2630 N Hamlin Ave. This recently renovated first-floor residence in a classic Chicago 2-flat boasts 1,000 sq ft of contemporary design with new hardwood flooring and updated appliances. Features: 2 welcoming bedrooms and 1 stylish bathroom New hardwood flooring throughout for a sleek look Updated appliances to enhance your culinary experience Spacious 2-car garage - additional $150/month per car Private backyard - your urban sanctuary Laundry convenience with facilities in the basement Small pets allowed Rent: $2350/month (includes utilities: gas, water) Utility Threshold: Utilities are included up to a threshold If usage exceeds this limit, the tenant will pay the difference over specified threshold Prime Location: Grocery: Tony's Fresh Market for your daily needs, just a short walk away. Transit: Diversey & Hamlin Bus Stop within 3 minutes, and Healy Metra Station a 12-minute walk. Blue Line: Logan Square Station, a quick 2-minute drive, connects you to downtown Chicago and beyond. Dining: Savor local flavors at Omarcito's Latin Cafe, L'Patron, and The Little Pickle. Nightlife: Enjoy evenings at Surge Billiards for a game night. This inviting space is ready for you to make it your own. Contact us to arrange a viewing and start your new chapter in Logan Square!"
}
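The values above are display strings: the rent includes a currency symbol, the size includes a unit, and the location starts with a "Property Address:" label. If you plan to sort or filter listings, a small normalization pass may help. This is a sketch under the assumption that the fields look like the sample output; toNumber, cleanLocation, and normalizeListing are illustrative names, not library functions.

```javascript
// Drop thousands separators, then take the first number found in the text.
// Returns null when the text contains no digits (e.g. "Studio").
function toNumber(text) {
  const match = String(text).replace(/,/g, '').match(/\d+(\.\d+)?/);
  return match ? Number(match[0]) : null;
}

// Remove the leading "Property Address:" label and the comma after it.
function cleanLocation(text) {
  return String(text).replace(/^Property Address:\s*,?\s*/i, '');
}

// Return a copy of the scraped object with typed, cleaned-up fields.
function normalizeListing(listing) {
  return {
    ...listing,
    rent: toNumber(listing.rent),
    bedrooms: toNumber(listing.bedrooms),
    bathrooms: toNumber(listing.bathrooms),
    size: toNumber(listing.size),
    location: cleanLocation(listing.location),
  };
}
```

Note that toNumber keeps only the first number, so a price range would be reduced to its lower bound. Adjust that if your listings show ranges.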

Final Words

This guide offers resources and techniques for scraping data from Apartments.com using JavaScript and the Crawlbase Crawling API. You can gather various types of data, such as property title, description, price, location, features, size, and more. Whether you're new to web scraping or have some experience, these insights will assist you in getting started. If you're interested in scraping data from other websites like Zillow, Redfin, Trulia, or Realtor, we also provide additional guides for you to explore.

Additional Guides:

How to Scrape Craigslist

How to Scrape Websites with ChatGPT

How to Scrape TikTok

Scrape Wikipedia in Python - Ultimate Tutorial

How to Scrape Google News using Smart Proxy

Frequently Asked Questions

Can you scrape Apartments.com?

Apartments.com can be scraped for real estate data using web scraping tools such as Crawlbase. Crawlbase is useful for scraping apartment listings, pricing, and descriptions from Apartments.com. Developers can use Crawlbase's features to browse the site structure, send HTTP requests, and parse HTML to extract specific property details. However, it is critical to follow Apartments.com's terms of service and employ ethical scraping practices. Use Crawlbase responsibly to scrape useful information from Apartments.com for a variety of applications.

Is it legal to scrape Apartments.com?

Whether scraping Apartments.com is legal depends on their terms of service. Generally, it's okay to scrape public data like apartment listings and rental prices if you follow the rules and don't violate the website's terms. However, scraping for business reasons or in large amounts might need permission. Always read Apartments.com's terms and consider legal advice if you're unsure.

What data can I scrape from Apartments.com?

The data you can scrape from Apartments.com includes apartment listings and rental prices; property features such as the number of bedrooms and bathrooms, square footage, and amenities like parking availability and gym facilities; location details such as neighborhood, city, and state; and contact information for landlords or property managers.

How do I handle CAPTCHAs when scraping Apartments.com?

Dealing with CAPTCHAs while scraping websites such as Apartments.com can be tough, but it's easier with the right tools. Services like Crawlbase's Crawling API use smart algorithms and artificial intelligence to solve CAPTCHAs automatically. This means your scraper can keep working smoothly without needing you to solve each CAPTCHA by hand. With this automation, your scraping process stays efficient and productive, letting you get the data you want without getting stuck on CAPTCHAs.

How do I prevent getting blocked while scraping Apartments.com?

To avoid getting blocked while scraping Apartments.com, use tools like Crawlbase's Crawling API. This service helps prevent blocks and bypass CAPTCHAs automatically using advanced technology. Crawlbase also provides proxy management and geolocation features, spreading requests across different IP addresses and locations.

How do I format and store the scraped data from Apartments.com?

Once you've scraped data from Apartments.com, you can organize it into CSV or JSON formats using programming languages like JavaScript. Store this formatted data in databases such as MySQL or PostgreSQL for convenient access and analysis. This keeps the data easy to manage and retrieve for future use.
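As a rough illustration of the CSV option, the sketch below converts an array of scraped objects into CSV text using only built-in Node.js features. The helper names (toCsvCell, toCsv, saveCsv) and the listings.csv file name in the usage note are assumptions made for this example. Nested values such as the amenities array are flattened into a single cell.

```javascript
const fs = require('fs');

// Quote one CSV cell: arrays are joined, objects are serialized as JSON,
// and embedded double quotes are doubled as the CSV format expects.
function toCsvCell(value) {
  let text;
  if (Array.isArray(value)) {
    text = value.join('; ');
  } else if (value !== null && typeof value === 'object') {
    text = JSON.stringify(value);
  } else {
    text = String(value ?? '');
  }
  return `"${text.replace(/"/g, '""')}"`;
}

// Build CSV text from an array of objects, using the keys of the
// first object as the header row.
function toCsv(rows) {
  if (rows.length === 0) return '';
  const headers = Object.keys(rows[0]);
  const lines = [headers.map(toCsvCell).join(',')];
  for (const row of rows) {
    lines.push(headers.map((key) => toCsvCell(row[key])).join(','));
  }
  return lines.join('\n');
}

// Write the CSV text to disk.
function saveCsv(filePath, rows) {
  fs.writeFileSync(filePath, toCsv(rows));
}
```

For the single property scraped in this guide, saveCsv('listings.csv', [propertyDetails]) would produce a header row plus one data row.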
