My Top 5 Do’s & Top 5 Don’ts
If there is anything I want you to take away from this document, it is these points, stated explicitly!
The Do’s
- Use a master workbook to store all needed data
- Upload files to relevant project Teams Channel
- Keep all of your Success & Error files stored in a folder in Teams. That way you don't lose anything, and when you do future projects you can reflect on what went on.
- Make sure all required fields have the API name as the column header, and that appropriate data is populated for every record.
- Have a unique key for all your records (concatenate multiple unique fields together)
- The second you receive any data – you are a data processor. So take care of the data, and don’t let it be exposed! Only take risks you are comfortable with.
The Don’ts
- DO NOT EDIT IN .csv. Just don’t. Never acceptable. Please.
- Do not have any empty columns – Dataloader will fall over.
- Do not allow duplicates to slip through into your upload
- Do not just wing it. No cowboy approaches with data. Take your time and do a good job.

Salesforce Dataload Masterclass – Best Practices Guide (Free PDF)
Master Salesforce Data Loading with this expert guide! Learn best practices, avoid common pitfalls, and optimize your data imports using Data Loader, Workbench, and more.
Methods for getting data into Salesforce
It doesn't matter who you ask; somebody will tell you a new way of uploading data to Salesforce you've never heard of. Whilst none of them are technically incorrect, there are definitely some best practices we can stick to as a company to try and remove any errors.
Standard Methods
Dataloader
This is my personal favourite tool, and the tool which I will be using to teach best practices in this document. You can install it easily on your desktop from your Salesforce Org, and it doesn't need much setting up.
Dataloader.io
This is basically Dataloader, but the web version. They are very similar, but the main difference is that this is the restricted version of Dataloader, and you have to pay to upload more records, ewww.
Data Import Wizard
To be honest with you, I don't think I've used this tool since I first started using Salesforce. I'm sure it has its pros and cons, but if you want to find out more you will have to ask somebody else or Google it. I do know you can find it in your org's Setup though.
Non-Standard Methods
NPSP Data importer
Similar to the Data Import Wizard, I wouldn't really use this, even if you have NPSP. BUT... the wicked feature this tool has is that before it finally uploads, it places your data in a holding pen, so you can, say, upload one .csv file with accounts and contacts, and then split them out in this holding pen before upload!
Code
You can use code to upload data into your org; a good example of this is using CumulusCI to produce test data.
API
If you have data sources other than Salesforce and want to integrate them, you will need an API to do this. Once you have it set up it can read and write data between the two sources.
Salesforce Inspector
The Salesforce Inspector is a powerful Google Chrome extension that has many features, one of which is its capability to export and import data! The export functionality works with a simple SOQL query, and the import by pasting in data from an Excel/csv file.
Steps for Data cleansing
How to install Dataloader
Multiple ways, as with literally everything I talk about in this document, but if you go to an org and search Setup for "data loader" or something like that, it will bring back a place where you can install it! Simplezzz.
If you use the web version of Dataloader, I believe there is a limit to how much you can upload, which we do not want! Imagine getting halfway through your upload and it tells you "no more"!!! Bad times indeed.
Oh, you might also need to install something called Zulu (a Java runtime Dataloader depends on), but I'm not advising on this; speak to your IT technician if you want to install it. Much safer way of doing it!
Data cleansing Pre-steps
In this section I will be going through, in depth, the steps taken for data loading/cleansing and the reasons behind them.
Requesting the data
When you request data from a client, you should always give them a template to work with. This makes it easier for them, but also easier for us! A lot easier for us, in fact!
Data Templates
- Make it explicit why you are using a template, and why they need to provide the data in the correct way
- It is so important we are given the data in the correct way, as it takes the accountability away from us. If they give us the data in a poor format, we have to update/manipulate it to put it in the correct format, which increases the room for error.
- Templates should separate out objects, to avoid further work, for example if they have Accounts and Contacts, these should either be in separate sheets in one excel workbook, or completely separate excel workbooks.
- If they are sending it in .csv, you might want to get them to check it after saving as a .csv to make sure it is all formatted correctly. Saving to .csv scrubs all the formatting, so some data could get tangled, which would lead to us uploading it incorrectly.
- At some point I will create a fairly standard data template you can send to clients, but as you can imagine it won't be too easy, as all clients have different data. It is good to send examples of what data should look like, just don't bombard the client with files. The aim of this is to promote them using our template, which means we have to help them & make it as easy as possible.
- Provide a GOOD, well thought out template. Don't just throw them some Excel sheet with a few column headers. Remember, by putting in the effort to create a decent, personalised template, we are saving ourselves the hassle of fixing formatting, or even loading in wrong data.
Communication
Communication is probably the most important part of getting the data from the client. You will need to hold their hand through the data process, as a lot of the time they won't have anybody experienced inside the business to handle this.
As well as letting them know how explicit the template is, you might want to try and set their expectations for the whole data process, as a lot of clients will have never done this before. Here are a few things you might want to mention to help set these expectations:
- This is probably the most important phase of a project. The whole reason you have Salesforce is to house and maintain data correctly. If you are putting rubbish data in, you will get rubbish data out!!! (GIGO- Garbage in = Garbage out).
- It is going to take a lot of time! A lot of companies will take weeks or even months to transition their data from one system to another; they will hire data scientists, analysts & engineers, because it is a big job! Obviously don't scare the client, but it is going to need a fair bit of time from both you and the client.
- Data needs to be transferred securely; the best way to do this is over a secure FTP service, which you can get for free. When you email data, it is passed from one computer, onto an email server, to another, and over to another computer. That's potentially 4 touchpoints. Also, please don't be storing customer data on your laptop. If anything goes wrong and any audits are done, you will be liable, as you are the data processor. SO TAKE CARE!
Pre-Checks
When you receive the data, it's important to quickly give it the once-over and make sure nothing sticks out straight away.
- These checks don’t have to be in-depth
- It is much better to flag these things early than to get halfway into your data cleansing and realise you can't work with part of the data.
- Remember: the client knows their data better than you do, or at least you hope they do, so it's much better they make these corrections than you doing it!
- Don't be afraid to push the data back to them and tell them, as nicely/supportively as possible, that you need them to fix something. Support them through this process; tell them what they need to fix it; if you know how, tell them some Excel formulas they can use, anything to promote them changing the data to meet your requirements.
Setting up your workspace
In my opinion, there are two main types of dataload/cleanse: when dealing with existing data & an existing Salesforce Org, and when starting with completely new data that they have provided from an old system/spreadsheets. Each one obviously needs to be handled slightly differently, but the best practices apply to both/all data manipulations.
For quick steps/best practices & examples on Dataloading, go to the Simple Steps for Visual Learners section.
We begin.
Excel Formatting
Save source data as Excel Workbook format [.xlsx] even if the client provides ‘csv ready data’.
Reasons not to use a csv
- csv format doesn’t save what’s on the screen in Excel
- Excel will happily display formatting information and data and allow you to “save” it to csv and actually throw all your hard work away.
- All formatting is lost on csv: colours, cell types, date formats, cell formulas, column widths, VLOOKUP data
- Multiple sheets are unsupported in csv.
- A single byte character (e.g a comma) can change the entire data structure of a csv file.
Storing your files
Before starting, create a master workbook and upload it to the project's Teams Channel, in the Files tab. Make sure you use a good naming convention and use folders where needed.
Reasons for having one master workbook
- It's clean; you don't end up having loads of documents with a naming convention that eventually gets too confusing
- Easy to produce VLOOKUP()’s as everything you need is in one place
- You know where to look- if you need to reference some data, you know where it is
Reasons for uploading documents to Teams
- When we do a project, we will have a Teams channel for it; it's important to have documents in one shared place
- If multiple people are working on a project, it allows them to work on the data too
- You can have a shared document if it's on Teams, allowing multiple people to work on it at once
- If future projects happen for that client, people can see/reference back to any work you have done
- Upload all your exports to the Teams channel; that way if anything goes wrong, or the client wants a copy of all their data before you did any uploads, they have it!
- Use a sensible naming convention, and use folders where necessary to group types of documents together (for example you might want a Dataload folder, with a sub-folder named "Exports"). Also make sure the files have a good naming convention; for an export file, you might want to put the date and time it was exported, and what it contains.
Starting your Data Cleansing/Manipulation
So, you have collated all your data, or at least the data you need for now, into your master workbook, and have made the workbook a live document in .xlsx.
With any data manipulation you will want to start with the parent-level source data, e.g. Accounts.
Copy this data into a new sheet in your master document and give the sheet a relevant name, e.g. "Accounts".
Map the fields to the correct API Field names
When you have your data in your spreadsheet, you will probably have field labels, or it might just be some name the client has given the field so they understand it (for example, a Lead Status field might be named Stage in the data they provided), so it's important we match these to the API names to avoid confusion when doing VLOOKUPs and when we eventually go to upload the data.
Reasons to map fields to the corresponding API Name
- After having worked with Salesforce for a while, especially when coming into a previously used org, you know that field labels are often duplicated, or at least very similar to a different field. When dataloading, this makes it hard to tell the difference between two fields, promoting mix-ups
- When you eventually go to upload the data, it's important that the names are already mapped; this way you can use the auto-mapping feature of Dataloader and there are no mix-ups (I would still run down the mappings once auto-mapped and cross-check).
- Just because a field label makes sense to you doesn't mean it will make sense to somebody else; use the API name, then nobody gets mixed up
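If you'd rather script the rename than retype headers by hand, the idea can be sketched in a few lines of Python. The labels and API names in the mapping below are made up for illustration; build yours from the client's sheet and your org's field list.

```python
# Hypothetical mapping from the client's column labels to API names;
# replace these with the labels in your sheet and the fields in your org.
LABEL_TO_API = {
    "Company": "Name",
    "Stage": "Status",  # e.g. a Lead Status field the client called "Stage"
    "Post Code": "BillingPostalCode",
}

def rename_headers(rows, mapping):
    """Swap header labels for API names; leave unknown columns untouched."""
    header, *data = rows
    return [[mapping.get(col, col) for col in header], *data]
```

Any column not in the mapping is passed through unchanged, so you can spot unmapped leftovers by eye before uploading.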
Field Formatting
Sometimes Dataloader can be a bit funny with the formatting of certain field types. A classic example I'm sure a lot of you are aware of is the date format. Which one does it want? 01/02/03? 01/02/2003? Wednesday 2nd February 2003? There are a lot of variables here, and it's important to have a place you can go to check your data is in the right format.
How do you even format a field in Excel? Make your selection (usually you will just select the column), right click, and press "Format Cells". In here you will see field formats, as well as sub-formats.
Date Fields
YYYY-MM-DD
Date/Time Fields
YYYY-MM-DD HH:MM:SS
This is assuming you have set up your Dataloader settings with your local time zone (more info on this further in).
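If you'd rather script the reformat than fight Excel's cell formats, here's a small Python sketch that converts date strings into the two shapes above. The source formats are assumptions (UK-style day/month/year); match them to however the client actually typed their dates.

```python
from datetime import datetime

def to_sf_date(value, source_format="%d/%m/%Y"):
    """Turn e.g. '02/01/2003' into the 'YYYY-MM-DD' form Dataloader wants."""
    return datetime.strptime(value, source_format).strftime("%Y-%m-%d")

def to_sf_datetime(value, source_format="%d/%m/%Y %H:%M"):
    """Turn e.g. '02/01/2003 14:30' into 'YYYY-MM-DD HH:MM:SS'."""
    return datetime.strptime(value, source_format).strftime("%Y-%m-%d %H:%M:%S")
```

If `strptime` throws an error on a row, that's a good thing: it's flagging a date that doesn't match the format you were told to expect.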
Postcode/Zip Code Fields
If you are working with Zip Code data that has leading zeroes, Excel will remove those zeroes. To keep them, change the cell format to Zip Code and it shouldn't do that. Also remember that when you convert from Excel to .csv, you will need to do this formatting again.
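If the zeroes have already been eaten, they can be padded back programmatically. A minimal sketch, assuming 5-digit US ZIPs (adjust the width for other countries' postcodes):

```python
def restore_zip(value, width=5):
    """Pad a ZIP code back to full width after Excel has stripped the
    leading zeroes, e.g. 2134 -> '02134'."""
    return str(value).zfill(width)
```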
Phone Fields
You can just use the cell format “Phone Number”. Also works for Fax… If anybody still uses those things.
Email Fields
Make sure your emails are formatted correctly; Salesforce won't allow bad values in an email field, so check for commas instead of dots, or for an email with mailto: in front of the address (which is a lot more common than you would think).
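Those two specific problems are easy to fix in bulk. A Python sketch, assuming a comma in the address really is a mistyped dot; eyeball the results before trusting that assumption:

```python
def clean_email(value):
    """Strip a stray 'mailto:' prefix and swap commas for dots."""
    email = value.strip()
    if email.lower().startswith("mailto:"):
        email = email[len("mailto:"):]
    return email.replace(",", ".")
```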
Currency Fields
Remove all commas and currency symbols from the number. This cell formatting should just be Number.
Salesforce will not allow text in a currency or number field.
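Stripping the symbols and separators is also easy to script. A minimal sketch; note it assumes "." is the decimal separator, so only run it on UK/US-style numbers:

```python
def clean_currency(value):
    """Keep only digits, the decimal point and a minus sign,
    e.g. '£1,234.50' -> '1234.50'."""
    return "".join(ch for ch in str(value) if ch.isdigit() or ch in ".-")
```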
Checkboxes
“1” or “TRUE” for Checked, and “0” or “False” for unchecked
I believe you can leave unchecked values as null, but best to play it safe and set them to “0”.
Address Fields
Make sure you are breaking down all the parts of an address into their corresponding fields if they are all in one cell. For example, one cell could contain all the Billing Address information, but it will need separating into its counterparts: Street, City, State/County, Post Code/ZIP, Country etc.
Picklists
Salesforce will allow any picklist value to be imported as long as the picklist is "Unrestricted" in its settings, which is pretty bad practice. So as standard, make sure all the values are the same as the picklist's values in Salesforce.
ID’s
Nearly forgot about this one, but a lookup in Salesforce has to use the record's ID; it can't use its name! This is where the very handy VLOOKUP() will come into play!
Exporting from Salesforce
There are a few ways to get data out of Salesforce; we can use multiple tools. But for simplicity and also for best practice's sake, I highly suggest (just do it anyway) using Dataloader (the desktop version, not the online version). Why? 18-character IDs, that's why!
I'll give you an example: you run a report in Salesforce, just your normal bog-standard report with all the fields you need and appropriate filtering applied, but oh wait, the ID field is bringing back the classic 15-character ID. This will not work. You try to upload this, and Dataloader will probably be sick!
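For context: 15-character IDs are case-sensitive, while tools like Excel compare text case-insensitively, so the 18-character form appends three checksum characters that encode the capitalisation. If you're ever stuck with 15-character IDs, the extension algorithm is public and simple enough to script; here's a Python sketch:

```python
def to_18_char(sfid_15):
    """Extend a 15-character case-sensitive Salesforce ID to the
    18-character case-insensitive form."""
    alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"
    suffix = ""
    for start in range(0, 15, 5):
        chunk = sfid_15[start:start + 5]
        # 5-bit number: bit i is set when character i of the chunk is uppercase
        bits = sum(1 << i for i, ch in enumerate(chunk) if ch.isupper())
        suffix += alphabet[bits]
    return sfid_15 + suffix
```

That said, exporting with Dataloader so you get 18-character IDs in the first place is still the path of least resistance.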
VLOOKUPS
This is by far the most used formula when it comes to dataloading or cleansing. If you are working with multiple objects which have relations to other objects, you will be using VLOOKUP(). In case you don't know how to use one, google "VLOOKUP()" and go to either Exceljet or Microsoft's website for tutorials, but essentially you will be exporting data from Salesforce and matching IDs with unique fields!
This is how they work-
=VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
Example-
=VLOOKUP(H3, B3:C13,2,FALSE)
To the right, To the right
IMPORTANT - VLOOKUPs only look to the right! (Ignore Beyoncé telling you to go to the left just this once!). This means the data you want to retrieve (result values) can appear in any column to the right of the lookup values. Very important, this one.
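If Excel isn't your thing, the same matching can be sketched in Python with a dictionary, which also sidesteps the to-the-right restriction entirely. The column names here ("Unique_Key__c", "Id") are just examples:

```python
def build_lookup(export_rows, key_col, id_col):
    """Build a {unique key -> record Id} map from exported rows
    (header assumed in row 0) - the same job a VLOOKUP does."""
    header, *data = export_rows
    key_idx, id_idx = header.index(key_col), header.index(id_col)
    return {row[key_idx]: row[id_idx] for row in data}
```

`lookup.get(key)` then plays the role of the VLOOKUP result, returning `None` instead of #N/A when there's no match.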
Unique Keys
When we use a VLOOKUP we need some unique field to match our IDs to. A lot of the time you will see people using the name of the record as the unique ID. The only problem with this is that when you export data, sometimes it can skew certain characters, which then won't match when you do the VLOOKUP. For example, on a Mac the ' is slightly different to on Windows (most of the time). Dataloader can do the same when it exports. Also, you can have duplicate record names. Bad times.
So we create a unique key!!
Find 2 or 3 unique fields in your sheet; this might be D.O.B, Street and Last Name. What are the chances somebody has the same date of birth, street AND last name? Probable? I think not.
Formula to concat fields-
=CONCAT(cell 1, cell 2, cell 3)
e.g. =CONCAT(A2,B2,TEXT(C2,"yyyy-mm-dd"))
Use TEXT() where you have a different type of field, for example a date field (note TEXT needs a format as its second argument).
**IMPORTANT** - a lot of people might not know how to copy a formula to all of the rows below. Create your formula and press Enter, and it will show your formula's output. Then hover over the bottom right of that cell, and your cursor should turn into a BLACK cross. Double click, and it should copy to all populated rows below. If your set of data isn't big you can also drag it down.
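The same key can be built in Python if you're scripting the cleanse. The lower-casing and space-stripping here are my additions, meant to make matching a bit more forgiving of the export quirks mentioned above; drop them if you need exact matching:

```python
def unique_key(*fields):
    """Concatenate fields into one matching key, like =CONCAT(...),
    normalised to lower case with spaces removed."""
    return "".join(str(f) for f in fields).lower().replace(" ", "")
```

Whatever normalisation you pick, apply the identical function to both sides of the match, or the keys will never line up.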
Datasets
One of the main reasons we use just one workbook and not many Excel files is that we can use named ranges (what I'll call datasets) to avoid having duplicate columns throughout your workbook.
To set one up in Excel, highlight the two columns that have your unique key and ID in them, then designate this as your dataset.
To name it, look in the very top left of your Excel sheet: on the left of your formula bar there is a little box (the Name Box) which probably shows the cell reference of your selection. Rename that. For example, if I want to make that the list of Account IDs with my unique key, I might name it "AccIdKey" (names can only contain letters, numbers, underscores and periods). That way, in my VLOOKUP() range, all I have to enter is that name. Cool ey.
Dataloading
Prepping for Dataloader
Dataloader can be picky at times, so hopefully if you have followed everything correctly up until now, you shouldn’t have too many problems.
Spotting Duplicates
A lot of the time, the client has sent over data but has NOT checked for duplicates. As a result of your VLOOKUPs these usually highlight themselves if there are any, but we also need to check before uploading.
To do this easily, we can use Conditional Formatting.
Make your selection, and under Home you will see "Conditional Formatting"; in here press "Highlight Cells Rules" then "Duplicate Values", set your colours, and press OK.
This will highlight your duplicates; remove/edit where necessary.
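The same check is a couple of lines of Python if your data is already out of Excel; this mirrors what the Highlight Duplicate Values rule flags:

```python
from collections import Counter

def find_duplicates(values):
    """Return every value that appears more than once, sorted."""
    return sorted(v for v, n in Counter(values).items() if n > 1)
```

Run it over your unique-key column: an empty result means you're clear to upload.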
The only time you're allowed to use .csv
Its time has finally come! I will allow you to use .csv.
Dataloader only allows .csv files to be uploaded, so we have to save our data as .csv before we can import.
Paste – Values
When you are done and ready to upload, create a new workbook and paste in all the data you need, and only the data you need! Don't include columns that you aren't uploading.
When you paste, make sure you Paste VALUES! This way any formulas will paste their outcome, not the actual formula, which is imperative.
Save as!
Once you are happy with it (always give it a quick scan), press File > Save As > .csv. Give it AN APPROPRIATE NAME, and save!
Personal Note - I like to close down that .csv and reopen it. As I've previously said, .csvs don't keep all formatting and can sometimes warp data that had formatting applied, so I like to do another quick check to make sure it hasn't messed with any of my dates, times, or any other formatting that was previously applied.
REMEMBER - don't make any changes to a .csv. If you spot any errors, fix them in the master document and reapply the process we just went through.
Loading the data
The time has come to upload the data you have so carefully got ready.
Dataloader settings
You may have seen earlier that I mentioned your local Time Zone setting; this is one example of the Dataloader settings you should make sure are set correctly. Usually you can just set these up once and they are good to go, but there are cases where you might need to change them.
To get to settings, launch Dataloader, then in the very top bar click Settings.
Batch Size
The batch size is how many records get uploaded per batch. As standard I believe this is set to 200, but if a record is linking to a lot of other records, it might be best to reduce this number to something like 50; all that will happen is the load will slow down a little.
Time Zone
As I say, you probably won't need to touch this as I believe it pulls from your user settings, but if this is wrong, you will want to amend it, as all your date/time fields will be off by a few hours.
The rest of the settings in here you will probably never need to touch, but it doesn’t cause any harm knowing what your options are when consulting a client about data loading for some “unique” use cases.
Insert, Update, Upsert, Export, Delete
What’s the difference between these?
Insert
This is used for inserting new records. There is no updating of existing records; only new records are created with this function. You will probably be using this one the most. You will not need the ID of the record, as it doesn't exist yet.
Update
This is where all the records you want to edit are already in Salesforce, and you just want to edit them. For this you will need the ID of the record to match your updates to it.
Upsert
This is a mix of both update and insert. Which one happens is defined by whether the Id column has a value or not: if it has a value, it will update the record; if not, it will create a new record.
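That decision rule is easy to picture as code. A sketch, treating each record as a dict keyed by column name, with an empty or missing Id meaning "insert":

```python
def split_for_upsert(records):
    """Split records the way upsert decides: a populated Id column
    means update, an empty one means insert."""
    updates = [r for r in records if r.get("Id")]
    inserts = [r for r in records if not r.get("Id")]
    return inserts, updates
```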
Export & Export All
Export - it is what it says on the tin: exporting data, like a report. You sign in, select your object, fields and criteria, then export. If you've ever worked with SOQL, this is what it's using.
Export All - this is like Export, except it also brings back soft-deleted records still sitting in the Recycle Bin... Be warned, exporting a whole org this way will produce a hefty folder of files. Good for if the client wants their whole data structure stored elsewhere as a backup.
If the client wants a backup- I wouldn’t use this unless it is for a one off. There are other functions in Salesforce for scheduled backups which work a lot better.
Delete
This works similarly to Update, except instead of updating the records, it deletes them. If I was doing this I would probably just use a SOQL query in the dev console to delete them. But if you are using this feature of Dataloader, you will need to upload a .csv file with the IDs of the records you want to bin. But hey, don't worry if you delete a record accidentally - did you know the Recycle Bin is now in Lightning? Haha (Salesforce pun).
All jokes aside, although the Recycle Bin is available, triple check the records before you bin them.
Do a test run
One thing I have recently started doing is a test run. Yeah, this adds more time to your dataload, but would you rather realise some data is wrong 3000 records into a load, or 5 records in? After some horror stories I've seen and heard, I'd take the latter.
Before uploading, just take a handful of records (like 3-5) and upload them to see what the outcome is. This is good for checking things like relationships, field formatting etc. And if it goes wrong, it's not hard to manually delete 5 or so records.
Why do it this way
- Single source of truth - no compound error
- A single workbook requires no relative paths to be saved to the workbook, which would inevitably not work for the next user.
- Repeatability: if something goes wrong at any stage of the process you can go back, retrace where the error was introduced and correct the affected data set.
- Accountability: if you can provide the customer with one workbook with all their data in it for them to check and verify, then the onus is on them if there is missing or corrupted data down the line.
Full List of Do’s/Do nots
- DO NOT Manipulate data in csv format. For all the above reasons, please just don’t.
- DO NOT delete the Success file data without first taking the relevant IDs back into the master workbook.
- DO NOT blindly force the 'failures' csv file through Dataloader with altered data until it saves.
Feel free to add to the list of Do's & Don'ts

