Join Chander Dhall for an in-depth discussion in this video User-defined functions (UDFs), part of Cosmos DB: Import, Manipulate, Index and Query.
Inside this data, I can have something like products, and I can have multiple of these products, and there's really not much I could do here, but let's investigate the data more. For example, funding rounds. As you can see, they have a raised amount and that's quite a high dollar amount there, and then the raised currency is US dollars. They raised that money in 2006. And then they have more funding rounds, as you can see, and there's some money associated with it. Now let's assume that you don't want to get this data out or calculate this data in the database, but what you might want to do is use a user-defined function that calculates this data out for you any time you invoke that function.
So how can we do this? Going back to Visual Studio, TotalFunding.js, I'm going to call this thing getTotalFunding and I'm passing in an object. This is a JSON object, I'm just calling it company. And then we're going to get rid of all this code and just say, var raisedAmount and start with zero and then we're going to keep adding that.
And then, if company can't be found for some reasons, then we have a problem, or company.funding_rounds, if there is no value there, then, of course, we have a problem, so we're going to say return zero, because if the company is a null object or company.funding_rounds is null, well, then we literally have no funding. Well, next, what we're going to do, is we're going to get the length of this array, 'cause it could be they got funded one time or 10 times or even zero times.
But in this case, we're looking for one or more values, so we get the funding rounds and then we have the length. That should give us the count and our for loop will basically eat right through this, go all the way to the array length, and do a very simple math. I'll take the company.funding_rounds and then take that particular funding round and the value associated with it, which was raised amount.
We take that value, and we also make sure we take this raised amount, paste it here, 'cause we want to keep incrementing this value, and then return the same value. Now, this is a very simple user-defined function or a UDF. You could use it for calculating taxes. For example, you have a zip code and for every zip code, you might have a completely different sales tax in that zip code, or even this tax that we have in the zip code, could also vary by the country.
So for United States, you could say if United States, these many zip codes, these are the different sales tax that apply. Or you could do Germany, has a very different kind of a sales tax. Any of that could be done. This is just a small example to show you that we can take any complicated object, do the math on it, and then we can use this in our C# code. Now that we've created our user-defined function, which is total funding, we'll come and change some of the code in here.
I'm going to get rid of this code and then we can start from scratch. In order to call that particular UDF, we need static async function, I'm going to call it GetTotalFunding, and then I could do it either ways, but what I like to do is get the UDF ID passed to the function and then have the database ID and the collection ID.
Then we're going to have the collection, similar to what we've been doing, ReadDocumentCollectionAsync and then use UriFactory.CreateDocumentCollectionUri and use the database ID and the collection ID. And then we can use that collection to get the SelfLink, and since we have the UDF ID, I like to create the filename from this.
We're doing the same exact thing except previously, if you remember, we had the script file path, and then we created the ID out of it, and this is just another way of doing the same thing. Most of the times, you would not need to do any of this out coding. This is just for demo purposes. And maybe you can pass in the udfId here. Well then, if you remember, we created a stored proc, in this case, we're going to create this UserDefinedFunction, and then give it the ID, which in this case, happens to be udfId.
And the body, pretty similar to what we did earlier, ReadAllText, and in this case, it's udfFileName. Well, next thing, what we want to do, is we also want to try deleting that UDF, and I could create a function in this case and text is collectionSelfLink, and of course the udf.Id.
Do Control + Dot, have Visual Studio create the matter for us, and then we'll fix the signature a bit. It's static, we want to make it async, everything looks fine. We have this connection link and then we have the ID. I just want to call it udfId so it's a little bit more clear. Can say UserDefinedFunction udf equals client.CreateUserDefinedFunctionQuery. If you remember, we did stored procedure query, and that's it, we're doing the same exact thing here.
And what gets passed is going to be the collection link, which we have right here. And then next, we want to use a where clause, similar to the stored proc, we're going to take that ID and then match it with udfId. We're trying to make sure that the UDF doesn't already exist, and if it does, then we need to grab it and delete it. And the next thing would be FirstOrDefault.
Now if this exists, then the UDF won't be null, and if that is the case, then we're going to try to delete it. As you can tell, that this API is very similar to what we just did for the stored proc. It's very predictable. The good part is, once you learn one API, you really don't have to learn another API, you can pretty much guess and get what you want. So what we've done is deleted a UDF if it existed prior to creating the new one.
So we need to create the UDF, and I'm going to say, CreateUserDefinedFunctionAsync and then pass in the collectionSelfLink and the UDF. And this UDF is the one we just created with the body and assigned it the same ID that got passed into this particular function. Next, we want the results after execution of this particular UDF, but since it's a UDF, it executes as part of something.
And it could be executed as part of a stored proc, as part of a query, and in this case, we'll just say, CreateDocumentQuery and we're going to say of type dynamic, and then pass it the self link, and now we can write the query. And this query could be pretty much anything. So let's look into our data to figure out what kind of query we can write. So, as you can see, we have name, and we can pick any of these names right here, so something like Yahoo, Cisco, I can copy that.
I could go say Select r.name and you could have given it any other name, I just chose r, comma udf.TotalFunding. You could have even called it doc or whatever, and then pass in r, which is really your document, and then call it as TotalFunding. So what we've done here is we've got the name and we're also executing a function on that particular document, and whatever value comes out of it is the total funding value.
And then we're saying from root r, and if we call this thing doc, then this alias would be doc, and this would be doc, too. So this is totally up to you, what name you want to give it, exactly like SQL. And then we can also have a where clause. In this case, I can say where name equals, and if you don't want to escape your double quotes, maybe you just use single quotes, and that should get the job done.
So now that we have this, we also might want to look into what other options we have, and one of the most important options that usually gets overlooked is feed options. And it's very important because, if we're doing a query that actually might span across partition queries, well, in this case we want to make sure that we EnableCrossPartitionQuery to be true, otherwise we will get an error. Let's assume we got the results. Once we have the results, all we need to do is iterate to the results and display them.
We can do something like Console.WriteLine and maybe say, the result is, and in this case, it's result, and this will be in, and that should work. I'm just going to put a break point right here and hit F5. Before I do that, we should right click the properties and make sure that it's Copy if newer.
Now since I copied this file, it automatically changes to Copy if newer, but if you manually add the file, make sure that this Copy to Output Directory is Copy if newer. Before we hit F5, we want to make sure that we call this function. So we GetTotalFunding and now we need the UDF IDs. In this case, if we know the UDF ID, rest will happen, and we need the database ID, which is Cazton, and then our collection ID, which is Test.
And I believe that's all it is, and we say Wait, and that should do the magic. And now we're going to press F5 and run this. As you can see, we got a result back, and the result says the name is Cisco and the total funding is whatever it is. And then say F5 again and we can come out of it. So as you saw there, it gave us a result back. You can come back and see what really happened. It actually went inside the company object, which was your document, and inside that document, we had funding rounds, and then inside the funding rounds, we had the raised amount, and all it did was add the raised amount to other amounts inside the same array for that particular record or that particular document.
- Data import scenarios
- Creating a database
- Creating a partitioned collection
- Data manipulation
- Importing documents with a stored procedure
- User-defined functions
- Excluding indexing at a document level
- Range indexing on strings
- Querying with SQL parameters
- Range operations