Centralized version control systems like Subversion/SVN use repositories (or "repos") as the top-level storage area for your project data. In this video, learn what a repository is, where a repository is located, and how you can access it from your workstation so you can share your projects with a team.
- [Instructor] At the heart of your version control system is the repository, also known as a repo. The repository is the place where all your files are stored. You can think of a repository as being like a file server. Your files and projects are stored in a folder structure just like you would store them on your file server. Individual projects in a repository might have their own folder and subfolder structure, and a repository can host multiple projects. The difference between a file server and a repository though is that a repository can time travel.
It stores not only the current versions of your files but also all the previous versions of your files. In fact, you can normally retrieve any previous version of any file from your repository as long as it was stored there in the first place. Because the repository has information about all your file revisions, it can also answer some interesting questions that a regular file backup can't answer. For example, it can tell you who changed a file. So if you're working on a team and a file gets modified, it's trivial to go back and find out who changed it and when.
It can tell you what changed in a file as long as it's a text file of some sort. If you know that one of your pieces of code got changed last Tuesday, for example, you can use the repository to see exactly which lines of code got changed. If you need a very specific older version of a file or an older version of an entire project, the repository makes your old files easy to find and retrieve. It's much easier than a standard file backup in most cases because you have access to your entire project history, all in one place.
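To make the "time travel" idea concrete, here is a minimal sketch in Python of a repository that keeps every revision of a file and can answer who changed it, when, and what changed. The `ToyRepository` class and its methods are hypothetical names made up for illustration; this is not how Subversion stores data, and for simplicity it numbers revisions per file rather than across the whole repository.

```python
import difflib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Revision:
    number: int
    author: str
    timestamp: datetime
    content: str


class ToyRepository:
    """Hypothetical in-memory repository, for illustration only."""

    def __init__(self):
        # Maps a file path to its list of revisions, oldest first.
        self._history = {}

    def commit(self, path, content, author):
        """Store a new revision of a file and return its revision number."""
        revisions = self._history.setdefault(path, [])
        revision = Revision(
            number=len(revisions) + 1,
            author=author,
            timestamp=datetime.now(timezone.utc),
            content=content,
        )
        revisions.append(revision)
        return revision.number

    def get(self, path, number=None):
        """Return the content of a given revision (latest if number is None)."""
        revisions = self._history[path]
        revision = revisions[-1] if number is None else revisions[number - 1]
        return revision.content

    def log(self, path):
        """Answer 'who changed this file, and when?'."""
        return [(r.number, r.author, r.timestamp) for r in self._history[path]]

    def diff(self, path, old, new):
        """Answer 'what changed between two revisions?' for text files."""
        old_lines = self.get(path, old).splitlines()
        new_lines = self.get(path, new).splitlines()
        return list(
            difflib.unified_diff(
                old_lines,
                new_lines,
                fromfile=f"{path}@{old}",
                tofile=f"{path}@{new}",
                lineterm="",
            )
        )


repo = ToyRepository()
repo.commit("hello.py", "print('hello')\n", "alice")
repo.commit("hello.py", "print('hello, world')\n", "bob")

print(repo.get("hello.py", 1))  # an older version is still retrievable
for number, author, when in repo.log("hello.py"):
    print(number, author, when.isoformat())
print("\n".join(repo.diff("hello.py", 1, 2)))
```

A real repository answers the same three kinds of questions (history, authorship, and differences), just with far more efficient storage and with the data shared across a team.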
There are two general types of repositories used by version control systems. The first type is a central repository, where all the files and revisions are stored on a server. I use the term server kind of loosely here because a central repository can be a piece of software running on your local machine, and it doesn't specifically have to be a dedicated server. If the server is remote, which is usually the case, a version control client can connect to it across a network in much the same way that an internet browser connects to a web server.
When you have changes that are ready to be stored in the repository, you can use your client to send those changes to the server. With a central repository setup, the server is generally the piece that does the hard work of determining what files were changed and how they were changed. So if you need to come back later and find out what the difference is between two versions of a file, the server might make that calculation and send you the result. Your local client can be kind of a dumb client that simply makes requests and listens for responses. Central repositories are used by version control systems like Subversion and CVS.
The other type of repository is a distributed repository, where all of the files are stored on the client machines of people who work on the code. A server isn't even required. Each client is essentially its own full local repository. You can have a server in a distributed repository setup, but if you do, the server acts just like any other client. Clients can connect to each other across a network in order to share changes. However, unlike a centralized repository, where you send entire files to the server, with a distributed repository, the clients only send changesets to each other.
So if you change two lines of a three megabyte file and you save the changes in a distributed repository, only the two changed lines of code are saved. You don't have to save an entirely new copy of the file. This can make the distributed repositories much smaller and much faster when doing certain kinds of operations because they're dealing with less data. They also require much smarter clients because the clients are required to do all the work in terms of determining exactly what changed between two revisions of a file. Version control systems like Git and Mercurial use distributed repositories.
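The idea of a changeset can be sketched in a few lines of Python. This is only an illustration of recording the changed regions of a file instead of a full copy; the function names `make_changeset` and `apply_changeset` are hypothetical, and the actual storage and transfer formats used by Git, Mercurial, or Subversion are different and are not shown here.

```python
import difflib


def make_changeset(old_lines, new_lines):
    """Record only the regions that differ between two versions of a file.

    Each entry is (start, end, replacement): replace old_lines[start:end]
    with the replacement lines.
    """
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((i1, i2, new_lines[j1:j2]))
    return changes


def apply_changeset(old_lines, changes):
    """Rebuild the new version from the old version plus the changeset."""
    result = list(old_lines)
    # Apply from the end of the file so earlier indexes stay valid.
    for start, end, replacement in reversed(changes):
        result[start:end] = replacement
    return result


# A 1,000-line file in which only two lines are edited.
old = [f"line {n}" for n in range(1000)]
new = list(old)
new[42] = "line 42 (edited)"
new[900] = "line 900 (edited)"

changeset = make_changeset(old, new)
print(len(changeset))  # number of changed regions, not number of lines in the file
print(apply_changeset(old, changeset) == new)
```

The changeset here holds two small entries, yet it is enough to rebuild the full new version from the old one, which is why exchanging changes can involve much less data than exchanging whole files.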
Despite their structural differences, these two types of systems have a number of similarities. First, when you make changes to your code throughout the day, you're never making the changes directly in the repository. You always work on local copies of your code. Later, when you're ready to save those changes to the repository, you merge those changes in. In fact, if you work all day on a bunch of changes and you decide at the end of the day you need to start over, you don't have to save your changes to the repository at all. You can just get a fresh copy of the latest files from the repository and start again.
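The working-copy workflow described above can be sketched with plain folders. In this Python sketch, the `checkout` and `commit` helpers are hypothetical stand-ins: they only copy files between a "repository" folder and a "working copy" folder, with no history and no real merging, which an actual version control client would handle for you.

```python
import shutil
import tempfile
from pathlib import Path


def checkout(repo_dir, working_dir):
    """Replace the working copy with a fresh copy of the repository's files."""
    if working_dir.exists():
        shutil.rmtree(working_dir)
    shutil.copytree(repo_dir, working_dir)


def commit(working_dir, repo_dir):
    """Copy the working copy's files back into the repository.

    Simplification: this overwrites files and keeps no history.
    """
    shutil.copytree(working_dir, repo_dir, dirs_exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    repo_dir = Path(root) / "repo"
    work_dir = Path(root) / "working-copy"
    repo_dir.mkdir()
    (repo_dir / "notes.txt").write_text("version 1\n")

    # Edits happen in the working copy, never directly in the repository.
    checkout(repo_dir, work_dir)
    (work_dir / "notes.txt").write_text("a day of experiments\n")
    untouched = (repo_dir / "notes.txt").read_text()

    # Decide to start over: get a fresh copy instead of saving the changes.
    checkout(repo_dir, work_dir)
    restored = (work_dir / "notes.txt").read_text()

    # Make a change worth keeping, then save it to the repository.
    (work_dir / "notes.txt").write_text("version 2\n")
    commit(work_dir, repo_dir)
    committed = (repo_dir / "notes.txt").read_text()

print(repr(untouched), repr(restored), repr(committed))
```

The repository only changes at the explicit commit step, so abandoning a day's work is just a matter of fetching a fresh copy.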
Also, the editing software you use does not have to be aware of your repository. Since you're always working on local copies of the files, you can use any editing software you want. It doesn't have to be software that's able to work with the version control system, and it doesn't even have to know what a repository is. After you've made your changes and you're ready to save them to the repository, your version control software can work directly against the local files on your machine, completely independent of your editing software.
So which kind of repository is better? Ehh, that's kind of a religious debate, and I think that both types of systems have their merits. Both systems were written specifically to aid programmers, so both types of systems will work well in terms of managing your code. In fact, if you have a good version control client, and especially if you have good version control software integration in your programming tools, it's sometimes hard to tell the difference between a centralized system and a distributed one. The version control clients tend to abstract away those different ways of storing data. We'll be focusing on Subversion, so we'll necessarily be talking about centralized repositories from here on out.
However, we'll also touch a little on Git integration and how centralized and distributed systems can actually work together.
- Trunks, tags, and branches
- Checkout, commits, and revisions
- Merging, locking, and working with a team
- TortoiseSVN on Windows
- SVN integration with Eclipse
- Connecting to a project
- Creating a new Java project in Eclipse
- Connecting to an existing Java project using Eclipse
- Dealing with projects that move to a new location
- Making changes and creating branches
- Tracking changes and dealing with conflicts
- Creating a release