Download SAMPLE database (one make)
Make, model, version, no specs (5 columns)
Make, model, version, basic specs (26 columns)
Make, model, version, full specs & features (188 columns)
Alternate formats: CSV and SQL (full specs & features)
Scroll down for bikes, trucks, buses, car dealers and other stuff.
Buy FULL database (all makes) + FREE monthly updates:
You can purchase a smaller package and later pay the difference to upgrade to a bigger package.
Contact me for custom packages, a selection of columns or specific makes only. See list of updates.
Description
In August 2015 I learned to scrape data from websites. The India car database was my first scraping project, as opposed to the manually written European car databases that I have been making since 2003. Lots of people were asking me if I had an Indian car database, and at that time nobody was selling one.
Source of data: Carwale.com, which removed discontinued models in March 2017, so my database contains valuable data that you can no longer get from Carwale yourself. Unfortunately India does not have a quality car website that displays the production years for every car version (as I personally wanted). The database contains production years only for discontinued models, and only if there are multiple generations of the same model. Not good in my opinion, but there was no alternative; I HAD TO DO IT this way. If you know a better source of data, please tell me.
Personally I viewed web scraping as a copyright issue and a cheating way to make databases, but most people are not against web scraping. I am interested only in CARS, but people asked me if I could scrape data for BIKES, TRUCKS, BUSES, car dealers and other stuff. And I did! For additional data scraping, please ask!
India database coverage
Mid-1990s to present, meaning that the database is complete for all cars ever imported into India.
Note that the Indian car market boomed only in the late 1990s and especially during the 2000s. In the early 1990s the ONLY cars available were Maruti (dominant passenger cars), Mahindra and Tata (off-road and commercial vehicles), Hindustan and Premier (outdated passenger cars).
Daewoo was the FIRST foreign car manufacturer to enter the Indian market (1994), followed by Ford and Opel (1996), Fiat (1997), Honda and Hyundai (1998).
List of makes included
Ashok Leyland, Aston Martin, Audi, Bentley, BMW, Bugatti, Caterham, Chevrolet, Chrysler, Datsun, Daewoo, DC, Eicher Polaris, Ferrari, Fiat, Force Motors, Ford, Honda, Hindustan Motors, Hyundai, ICML, Isuzu, Jaguar, Jeep, Lamborghini, Land Rover, Lexus, Mahindra, Mahindra Renault, Maini, Maruti Suzuki, Maserati, Maybach, Mercedes-Benz, Mini, Mitsubishi, Opel, Nissan, Porsche, Premier, Renault, Rolls-Royce, San, Skoda, Ssangyong, Tata, Toyota, Volkswagen, Volvo, Willys.
Expected to be launched in India in 2018: Acura, Infiniti, Tesla.
Data fields included
Naming: ID, Make, Model, Version, Status 100%.
Price: Production cars 30.65%, Discontinued cars (last recorded price) 69.35%.
Body: Length (mm) 99.70%, Width (mm) 99.70%, Height (mm) 99.67%, Wheelbase (mm) 99.54%, Ground clearance (mm) 59.27%, Kerb weight (kg) 61.11%, Bootspace (litres) 48.75%, No of doors 99.54%, Seating capacity 99.57%, No of seating rows 72.36%.
Engine: Displacement (cc) 99.21%, Max power (bhp) 99.43%, Max power (rpm) 99.21%, Max torque (Nm) 99.43%, Max torque (rpm) 99.73%, Transmission type 99.78%, No of gears 97.15%, Drivetrain 86.74%, Engine type 87.39%, Cylinders 72.17%, Bore x Stroke (mm) 13.18%, Compression ratio 9.16%, Valves per cylinder 69.73%, Dual clutch 60.92%, Sport mode 61.74%, Fuel system 31.33%, Turbocharger/supercharger 50.60%, Turbocharge type 50.16%, Driving modes 51.01%, Manual shifting for automatic 50.03%, Engine start-stop 49.78%.
Fuel: Fuel type 99.67%, Alternate fuel type 63.53%, Mileage (kmpl) 87.31%, Fuel tank capacity (litres) 96.30%.
Drivetrain: Suspension front 91.30%, Suspension rear 90.68%, Brake type front 98.48%, Brake type rear 98.23%, Steering type 50.87%, Turning radius (m) 80.14%, Wheels 50.95%, Spare wheel 68.53%, Tyres front 71.28%, Tyres rear 71.22%.
Others: Colour names 93.89%, Colour RGB 93.89%, Image URL 87.85% (you can use Tab Save extension for Chrome to download image files).
Features: 131 columns; see the SAMPLE file. I do not list them here, to avoid overloading the page with too much text.
Bonus: Car class, Body style 100.00% (added manually from my personal experience, NOT scraped from website).
Percentages as of 1 January 2017 (3680 cars).
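For readers who want to check coverage figures like these on the SAMPLE file, here is a minimal sketch. It assumes each percentage is the share of rows with a non-empty value in that column; the function name `fill_rates` and the placeholder rows are my own illustration, not real database content.

```python
def fill_rates(rows, columns):
    """Return {column: percent of rows with a non-empty value}, rounded to 2 decimals."""
    total = len(rows)
    rates = {}
    for col in columns:
        filled = 0
        for row in rows:
            value = row.get(col)
            # Count the cell as filled only if it holds something other than blanks.
            if value is not None and str(value).strip() != "":
                filled += 1
        rates[col] = round(100.0 * filled / total, 2) if total else 0.0
    return rates

# Placeholder rows (invented values, only to exercise the function).
sample_rows = [
    {"Make": "MakeA", "Length (mm)": "1", "Bore x Stroke (mm)": ""},
    {"Make": "MakeB", "Length (mm)": "2", "Bore x Stroke (mm)": "3 x 4"},
    {"Make": "MakeC", "Length (mm)": "", "Bore x Stroke (mm)": ""},
    {"Make": "MakeD", "Length (mm)": "5", "Bore x Stroke (mm)": None},
]
rates = fill_rates(sample_rows, ["Make", "Length (mm)", "Bore x Stroke (mm)"])
print(rates)
```

To run it on a real file, the rows could be loaded with Python's `csv.DictReader` from the CSV version of the database.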
How it was made & quality notes
While the European car database is made via manual data entry (thus adding new data constantly), and the American car database is a mix of manual work and automatic web scraping, the Indian car database is purely a web scraping job.
Good news: easy to update. Bad news: I sell it “as is”, without corrections or filling in missing data. A customer asked me if I can add a body type column (hatchback, sedan, convertible, etc). I can add it for a fee, but at each update any additional data will be lost, so I would need to charge the fee again = not cost-effective.
Typical programmers charge $300-500 to build a web scraper (example), plus $50-100 for each update. You can buy ready-made data from me for $30-120 with free monthly updates for one year.
Is it OK to scrape data from websites using automatic software, instead of compiling data manually in Excel?
Future updates will be only for new cars… sorry for this
The Indian car database launched for sale in August 2015, and after about 8 sales and 3 updates, I decided in May 2016 to offer monthly updates on the 1st day of each month. Each update takes 8 hours of data scraping + 2 hours of post-scraping manual work. I scrape every make to get its models, every model URL to get its versions, and every version URL to get its specifications, then remove all data from the previous update and put in the new data. See list of updates.
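The make, model, version walk described above can be sketched roughly as follows. This is not my actual scraper: `FAKE_SITE` is an in-memory stand-in for the website, and every name, URL and value in it is an invented placeholder, so the example runs without any network access.

```python
# Stand-in for the website: make -> model -> {version URL: specs}.
# All names, URLs and values here are invented placeholders.
FAKE_SITE = {
    "MakeA": {
        "Model1": {
            "/makea/model1/v1": {"Version": "v1", "Fuel type": "Petrol"},
            "/makea/model1/v2": {"Version": "v2", "Fuel type": "Diesel"},
        }
    },
    "MakeB": {
        "Model2": {
            "/makeb/model2/v1": {"Version": "v1", "Fuel type": "Petrol"},
        }
    },
}

def list_makes():
    return sorted(FAKE_SITE)

def list_models(make):
    return sorted(FAKE_SITE[make])

def list_version_urls(make, model):
    return sorted(FAKE_SITE[make][model])

def get_specs(make, model, url):
    return dict(FAKE_SITE[make][model][url])

def full_rescrape():
    """Rebuild the whole table from scratch, one row per version URL."""
    rows = []  # starts empty: data from the previous update is discarded
    for make in list_makes():
        for model in list_models(make):
            for url in list_version_urls(make, model):
                row = {"Make": make, "Model": model, "URL": url}
                row.update(get_specs(make, model, url))
                rows.append(row)
    return rows

rows = full_rescrape()
print(len(rows), "versions scraped")
```

In a real scraper the four helper functions would fetch and parse pages; keeping them separate from the loop means the walk itself does not change when the site layout does.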
In February 2017 Carwale decided to hide discontinued models from the website. I kept updating the database by scraping for new version URLs, adding them to the database, comparing the unique IDs and deleting duplicates, then scraping all version URLs to get specifications, including the current price and last recorded price, which indicate whether a car is in production or discontinued.
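A rough sketch of that incremental update, under stated assumptions: rows are dictionaries with an "ID" key, and the column names "Current price" and "Last recorded price" are illustrative labels rather than the exact headers in my files. The sample rows are invented.

```python
def merge_new_versions(existing, discovered):
    """Add discovered rows whose ID is not yet known; skip duplicates."""
    merged = {row["ID"]: row for row in existing}
    added_ids = []
    for row in discovered:
        if row["ID"] not in merged:
            merged[row["ID"]] = row
            added_ids.append(row["ID"])
    return list(merged.values()), added_ids

def status_from_prices(row):
    """Infer status from which price field is filled."""
    if row.get("Current price"):
        return "Production"
    if row.get("Last recorded price"):
        return "Discontinued"
    return "Unknown"

# Invented example rows.
existing = [
    {"ID": 101, "Version": "v1", "Current price": "500000"},
    {"ID": 102, "Version": "v2", "Last recorded price": "450000"},
]
discovered = [
    {"ID": 102, "Version": "v2"},                              # already known
    {"ID": 103, "Version": "v3", "Current price": "600000"},   # new
    {"ID": 103, "Version": "v3", "Current price": "600000"},   # duplicate
]
database, added_ids = merge_new_versions(existing, discovered)
statuses = {row["ID"]: status_from_prices(row) for row in database}
print(added_ids, statuses)
```

Keeping old rows in the table is the point of this step: a discontinued car that has vanished from the site's listings stays in the database because its ID is already known.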
In November 2017 Carwale removed the unique ID from each URL, causing all URLs to be changed and redirected. In 10 cases the old version URL redirects to 404 Not Found, and in 197 cases the old version URL redirects to a different car than it should (multiple old URLs redirect to the same new URL). This makes it impossible for me to re-scrape old cars for updates without risking the loss of model versions. I will continue to update the database monthly by adding new cars, but without updates for old cars, database quality will go down over time. When a model is discontinued and replaced with a new model of the same name, the discontinued model's name will not be updated with production years to differentiate it from the current production model; specifications, including prices, will also not be updated for old cars.
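The redirect problem can be illustrated with a small check. This sketch assumes the redirects have already been followed and recorded as a mapping from old URL to final URL, with `None` for a 404; the URLs below are invented placeholders, not real Carwale addresses.

```python
from collections import defaultdict

def classify_redirects(redirects):
    """Split old URLs into dead links, colliding redirects and safe one-to-one redirects.

    redirects maps old URL -> final URL, or None when the old URL ends in 404.
    """
    dead = [old for old, new in redirects.items() if new is None]
    by_target = defaultdict(list)
    for old, new in redirects.items():
        if new is not None:
            by_target[new].append(old)
    # Several old URLs landing on one new URL: at most one of them can be the right car.
    collisions = {new: olds for new, olds in by_target.items() if len(olds) > 1}
    safe = {olds[0]: new for new, olds in by_target.items() if len(olds) == 1}
    return dead, collisions, safe

# Invented placeholder URLs.
redirects = {
    "/old/car-1001": "/new/car-a",
    "/old/car-1002": "/new/car-b",
    "/old/car-1003": "/new/car-b",
    "/old/car-1004": None,
}
dead, collisions, safe = classify_redirects(redirects)
print(len(dead), "dead,", len(collisions), "colliding targets,", len(safe), "safe")
```

Re-scraping through a colliding redirect would overwrite several distinct old versions with the specs of one car, which is the loss of model versions described above.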
India bikes database
One of the first people who bought the Indian car database in August 2015 wanted a 2-wheeler database too. Initially I said NO because I was personally interested only in cars, but once I had mastered data scraping, I decided to offer web scraping services to individual customers. In January 2016 another customer wanted bike specs scraped from Bikewale.com, and that was when I created the bike database. I update it every few months, when I get new sales.
Initially, in 2016, Bikewale showed only current production models. Discontinued models were added near the end of 2016, but I was not aware of this until October 2017. I then updated the database, raising the number of models from ~250 to over 600. Note: if a bike has multiple versions, the database contains the specs of the base version only; because the versions do not have distinct URLs, proper scraping is impossible.
Download FREE sample
Bikes Database.xls
Buy full database
Dealers database
Several people asked me to scrape dealer information from Carwale and Bikewale. Here is the database, containing dealer name, street address, email and phone number.
Download samples:
India-Car-Dealers-SAMPLE.xls
India-Bike-Dealers-SAMPLE.xls
Buy full database:
Trucks and buses
I made them in September 2016 for a customer who never paid for the job he asked me to do. The first person to purchase them came in January 2018, so I updated them for the first time; future updates will be done on request.
As of August 2018 I noticed that CarDekho made each version URL redirect to the main model URL, effectively making it impossible for me to scrape specifications of versions other than the base version. I will probably never update them again. Only 3 people purchased them, so it is not worth my effort to keep maintaining them.
Download FREE samples
Trucks Database.xls and Buses Database.xls – Easy job, source of data: CarDekho.com.
Buy FULL database
Other databases
CarWale On-Road Prices.csv – A very difficult job which involved about 20 hours of coding in Visual Basic to make an application that sends JavaScript requests to the CarWale website to get the price of each car in each city. The application works at a rate of 2 requests per second, so 3100 cars × 510 cities ≈ 1.6 million requests, or roughly 220 hours, to get all on-road prices, RTO tax and insurance. The empty cells are there because certain cars are available only in selected cities (Aston Martin only in Mumbai, while Maruti is available in 500+ cities).
Due to the time required to scrape all cities, I did subsequent updates MONTHLY for a shortened list of 47 cities, which takes about 22 hours, and asked the customer to pay for each update.
Goods and Services Tax was implemented on 1 July 2017 and harmonized car prices across India, so the customer told me that updates were no longer required.
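The timing for the on-road price job can be estimated with a few lines. This sketch uses the figures quoted above (3100 cars, 510 or 47 cities, 2 requests per second) and assumes one request per car per city; the function name `scrape_hours` is my own. It counts only request time, so treat the result as a lower bound on the hours quoted.

```python
def scrape_hours(cars, cities, requests_per_second=2.0):
    """Return (number of requests, hours needed) for one request per car per city."""
    requests = cars * cities
    seconds = requests / requests_per_second
    return requests, seconds / 3600.0

full_requests, full_hours = scrape_hours(3100, 510)   # all cities
short_requests, short_hours = scrape_hours(3100, 47)  # shortened city list

print(full_requests, "requests,", round(full_hours, 1), "hours")
print(short_requests, "requests,", round(short_hours, 1), "hours")
```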
Used cars – Scraping used car websites is a big stupidity in my opinion: they are never complete, cars are removed from the site once their owners find a buyer, and you do not get any usable data. But someone specifically asked me to scrape an Indian used cars website. CarDekho was the easiest to scrape, and has 45000+ cars.
Buy FULL database
WARNING for people asking about car owners
A number of people have trouble understanding what I am selling, or don’t bother to check the samples of what I am selling (a database of car MODELS), and ask me outright to sell them a database of car OWNERS with registration number, name, address, profession, phone, email, insurance expiry date, etc. Strangely, I do not get such questions from Europe and America, but ONLY from India.
I DO NOT have registration / owner data, and the companies that do have it (car dealerships and insurance companies) must follow personal data protection laws and DO NOT share their customers’ data with third parties.
If you do a Google search for “car owners database” you will see at least 10 sites selling personal data illegally, all of them from India, the only country in the world where people have no respect for personal data and email/SMS spamming is like a national sport. But I am skeptical about how real this data is, how it was obtained and how up to date it is, considering that the vehicle registration authority does not keep records of emails and phone numbers, only of residence addresses. Furthermore, most drivers do not even use email!
The only way to get car registration data legally and up to date is to apply at vahan.nic.in: “The Ministry has decided to offer the services to different stake holders like Banks, Insurance Companies etc on payment basis.” If you do apply, please let me know what their prices are.