I am currently working on source code that is over 5Gb in size. This is mostly due to a poorly thought out folder structure, there are code files, images and Excel files all jumbled together. I think a clear distinction should be made between source code and data.
I would define data as anything that is added to the project during its life. So if you have an upload option, anything that is uploaded I would describe as data. The site should still function without (or very little) data.
Images can fit into both groups. Any icons or images attached to the functionality of the project I would class as source code. However anything that is uploaded should be classed as Data.
The database should also be classed as both. The data, anything that is inside a database table should normally be classed as data. Stored Procedures, Functions and Views are all Source Code and would benefit from version control.
Source control is not an excuse not to backup things. Don’t just commit files to source control so you know you can restore them if you need to. Files in general in source control are there so you can see how they changed over time as the code base changed. Files in you backup are a snapshot of what the application was at a point in time and will include ALL the data.
One last point before I end. If you are hosting on a Cloud Computing platform like Azure it gives you an easy way to distinguish between Data and Code.
Anything in your
- Web App = Code
- Blob Storage = Data
- SQL = Data/Code
Each project is unique and there will always be exceptions to these suggestions but I think this is a good goal to have. What do you think?