Using Libcurl to Download Files in C++ Synchronously, Asynchronously, and with Multiplexing

Applications are useless without their contents: text notes, images, audio and video recordings. But all these files don’t come out of nowhere. Your application must be able to download the data it needs from the internet.

In this article, we give an overview of libcurl – a popular library for transferring files – and explain how to use it in C++ solutions. We also describe three methods for downloading multiple files with libcurl in C++ applications – synchronous downloading, asynchronous downloading, and multiplexing – and provide detailed code samples for each.

This article will be useful for C++ developers who are getting acquainted with network communication peculiarities and want to implement different file downloading options in their applications.

Contents:

What is libcurl?

Synchronous file downloading

Asynchronous file downloading

Multiplexing

Which method should you choose?

Conclusion

What is libcurl?

There are many tools that simplify the work of C++ developers. One of them is a popular multiprotocol file transfer library called libcurl.

Libcurl is part of the Client URL (cURL) project, which consists of two major components: libcurl, a library with the functions you need to implement file transfer features in your applications, and a command-line tool called curl. As stated by its creators, the use cases of curl range from media players and mobile phones to television sets and even cars.

The best thing about this client-side URL transfer library is that it’s free and easy to use. Libcurl supports a wide selection of protocols (including HTTP and HTTPS), certificates, file uploading, and user authentication methods. Libcurl exposes a C API, so its functions can be called directly from C++ code.

Now that we’ve briefly overviewed the library itself, let’s move to the tasks you can accomplish with its help.

In this article, we focus on three methods for downloading files with libcurl in C++ apps:

Synchronous
Asynchronous
Multiplexing

To make it easier for you to recreate all the actions described in this article, we’ve used the examples provided by libcurl:

Synchronous: libcurl example – https.c
Asynchronous: libcurl example – multi-double.c
Multiplexing: libcurl example – http2-download.c
Other examples: libcurl – source code examples

Make sure to download these examples before getting started. Also, note that these examples are provided for a Visual Studio 2015 project that uses the latest versions of the curl-vc140-static-32_64 and openssl-vc140-static-32_64 NuGet packages.

Synchronous file downloading

We’ll start with a description of the synchronous file downloading method. With synchronous downloading, all requests are executed in a strict sequence. While this file transfer method is the easiest, it’s pretty inefficient and can’t be used when you need to transfer multiple files at once.

Let’s start with code based on the https.c example.

1. First of all, we need to initialize global libcurl variables by calling the following function:

{code}curl_global_init(CURL_GLOBAL_DEFAULT);{/code}

When we’ve finished working with libcurl, we need to uninitialize its global variables by calling the curl_global_cleanup(); function.

Note: If curl_global_init hasn’t been called yet, curl_easy_init calls it automatically as a fail-safe. Therefore, in small single-threaded programs, we can skip calls to curl_global_init and curl_global_cleanup.

We can also create a static variable that will call the curl_global_init(CURL_GLOBAL_DEFAULT); function in the constructor and the curl_global_cleanup(); function in the destructor. To do so, we’ll use the CurlGlobalStateGuard class:

{code}
class CurlGlobalStateGuard
{
public:
    CurlGlobalStateGuard() { curl_global_init(CURL_GLOBAL_DEFAULT); }
    ~CurlGlobalStateGuard() { curl_global_cleanup(); }
};

static CurlGlobalStateGuard handle_curl_state;
{/code}

2. To perform synchronous downloads, we need to use libcurl’s easy interface. We start with creating a curl easy handle for setting connection options and transferring data.

First, we create an easy handle by calling curl_easy_init(). When the target file is successfully downloaded, this handle should be freed by calling the curl_easy_cleanup(CURL*) function.

At this step, we need to apply a standard unique_ptr smart pointer:

{code}
using EasyHandle = std::unique_ptr<CURL, std::function<void(CURL*)>>;

EasyHandle CreateEasyHandle()
{
    auto curl = EasyHandle(curl_easy_init(), curl_easy_cleanup);
    if (!curl)
    {
        throw std::runtime_error("Failed creating CURL easy object");
    }
    return curl;
}
{/code}

We can use the unique_ptr smart pointer instead of a class definition to keep the CURL pointer for us and call the curl_easy_cleanup function in the destructor.
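The std::function deleter used above works, but it stores a type-erased callable inside every handle. A lighter variant uses a plain function pointer as the deleter type. The sketch below illustrates the pattern on a stand-in resource so that it compiles without the library: FakeHandle, fake_init, fake_cleanup, and CreateFakeHandle are names invented for this example and are not part of libcurl. With libcurl, the same shape would presumably use CURL, curl_easy_init, and curl_easy_cleanup instead.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical stand-ins for CURL, curl_easy_init() and curl_easy_cleanup().
struct FakeHandle { int id; };

int g_live_handles = 0;  // counts handles that have not been cleaned up yet

FakeHandle* fake_init()
{
    ++g_live_handles;
    return new FakeHandle{g_live_handles};
}

void fake_cleanup(FakeHandle* handle)
{
    --g_live_handles;
    delete handle;
}

// The deleter type is a plain function pointer instead of std::function.
using FakeEasyHandle = std::unique_ptr<FakeHandle, void (*)(FakeHandle*)>;

FakeEasyHandle CreateFakeHandle()
{
    FakeEasyHandle handle(fake_init(), &fake_cleanup);
    if (!handle)
    {
        throw std::runtime_error("Failed creating handle");
    }
    return handle;  // ownership moves to the caller
}
```

One trade-off of this variant: a unique_ptr with a function-pointer deleter has no default constructor, so a container created with a fixed number of empty handles would have to be filled with emplace_back instead.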

3. In this example, we’ll download three files. Therefore, we need to create three handles:

{code}
std::list<EasyHandle> handles(3);

/* init easy stacks */
try
{
    std::for_each(handles.begin(), handles.end(), [](auto& handle) { handle = CreateEasyHandle(); });
}
catch (const std::exception& ex)
{
    std::cerr << ex.what() << std::endl;
    return -1;
}
{/code}

To simplify the use of multiple handles for similar actions, we can use a container (such as std::list) and the std::for_each function.

4. At this step, we set options for easy handles. To add links to the files that need to be downloaded, we set the CURLOPT_URL option:

{code}
for (auto& handle : handles)
{
    /* set options */
    curl_easy_setopt(handle.get(), CURLOPT_URL, "https://raw.githubusercontent.com/curl/curl/master/docs/examples/https.c");
}
{/code}

5. Next, we need to deal with SSL connections. To simplify our tutorial, we can tell libcurl not to verify SSL connections by setting the CURLOPT_SSL_VERIFYPEER and CURLOPT_SSL_VERIFYHOST options to 0:

{code}
void set_ssl(CURL* curl)
{
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L);
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0L);
}

/* ... */

for (auto& handle : handles)
{
    set_ssl(handle.get());
}
{/code}

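Note: Disabling verification is acceptable only for testing. In production code, leave verification enabled; curl provides options for verifying server certificates, such as CURLOPT_CAINFO, which specifies the CA bundle to verify against. (CURLOPT_SSLCERT, by contrast, sets the client certificate.)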

6. By default, curl will print downloaded data to the console, which is useful for testing purposes. To change this behavior, we can redefine the CURLOPT_WRITEFUNCTION option and change it to write to a file:

{code}
namespace
{
size_t write_to_file(void* contents, size_t size, size_t nmemb, void* userp)
{
    size_t realsize = size * nmemb;
    auto file = reinterpret_cast<std::ofstream*>(userp);
    file->write(reinterpret_cast<const char*>(contents), realsize);
    return realsize;
}
}

void save_to_file(CURL* curl)
{
    static std::ofstream file("downloaded_data.txt", std::ios::binary);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_to_file);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, reinterpret_cast<void*>(&file));
}

/* ... */

for (auto& handle : handles)
{
    save_to_file(handle.get());
}
{/code}

In this example, we open the downloaded_data.txt file as an std::ofstream and pass its address in the CURLOPT_WRITEDATA option. Libcurl passes this pointer to the write_to_file function as the last argument. We also set the CURLOPT_WRITEFUNCTION option to write_to_file so that incoming data is written to the file.
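Because the write callback is an ordinary function, its contract can be exercised without any network traffic. The sketch below is a variant that appends incoming bytes to a std::string instead of a file; append_to_string is a name chosen for this example. With libcurl, it would presumably be registered through CURLOPT_WRITEFUNCTION, with the address of the string passed as CURLOPT_WRITEDATA.

```cpp
#include <cstddef>
#include <string>

// Same parameter list as write_to_file above.
// Assumption: userp points at a std::string owned by the caller.
std::size_t append_to_string(void* contents, std::size_t size, std::size_t nmemb, void* userp)
{
    const std::size_t realsize = size * nmemb;
    auto* buffer = static_cast<std::string*>(userp);
    buffer->append(static_cast<const char*>(contents), realsize);
    // Returning a different number than realsize would signal an error to libcurl.
    return realsize;
}
```

Calling the function twice with two chunks leaves the concatenated data in the string, which mirrors how libcurl may deliver one response in several pieces.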

These options should be applied to all curl easy handles we use.

7. At this point, we can call curl_easy_perform to execute our file download requests.

{code}
for (auto& handle : handles)
{
    /* Perform the request, res will get the return code */
    auto res = curl_easy_perform(handle.get());

    /* Check for errors */
    if (res != CURLE_OK)
    {
        std::cerr << "curl_easy_perform() failed:" << curl_easy_strerror(res) << std::endl;
        return -1;
    }
}
{/code}

The curl_easy_perform function doesn’t return until the request is finished, either successfully or unsuccessfully. Once it returns, we check the result code and, if it isn’t CURLE_OK, print the error string returned by curl_easy_strerror to std::cerr.

Here’s the full code for executing synchronous file downloading with libcurl:

{code}
int download_synchronous(void)
{
    std::list<EasyHandle> handles(3);

    /* init easy stacks */
    try
    {
        std::for_each(handles.begin(), handles.end(), [](auto& handle) { handle = CreateEasyHandle(); });
    }
    catch (const std::exception& ex)
    {
        std::cerr << ex.what() << std::endl;
        return -1;
    }

    for (auto& handle : handles)
    {
        /* set options */
        curl_easy_setopt(handle.get(), CURLOPT_URL, "https://raw.githubusercontent.com/curl/curl/master/docs/examples/https.c");
        set_ssl(handle.get());
        save_to_file(handle.get());

        /* Perform the request, res will get the return code */
        auto res = curl_easy_perform(handle.get());

        /* Check for errors */
        if (res != CURLE_OK)
        {
            std::cerr << "curl_easy_perform() failed:" << curl_easy_strerror(res) << std::endl;
            return -1;
        }
    }
    return 0;
}
{/code}
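Note that save_to_file keeps a single static stream, so all three downloads in the listing above end up in downloaded_data.txt. If each download should go to its own file, one option is to derive the file name from the URL. The helper below is a minimal sketch: filename_from_url and the download.bin fallback are assumptions made for this example, and the parsing is deliberately naive (no percent-decoding and no sanitizing of the result).

```cpp
#include <string>

// Returns the last path segment of a URL, or the fallback when there is none.
std::string filename_from_url(const std::string& url,
                              const std::string& fallback = "download.bin")
{
    // Drop the query string and fragment, if any.
    const std::string no_query = url.substr(0, url.find_first_of("?#"));

    // Skip "scheme://" so its slashes are not mistaken for the path.
    const std::string::size_type scheme_end = no_query.find("://");
    const std::string::size_type host_start =
        (scheme_end == std::string::npos) ? 0 : scheme_end + 3;

    const std::string::size_type last_slash = no_query.find_last_of('/');
    if (last_slash == std::string::npos || last_slash < host_start)
    {
        return fallback;  // no path component at all
    }

    const std::string name = no_query.substr(last_slash + 1);
    return name.empty() ? fallback : name;
}
```

For the sample URL used in this section, the helper yields https.c; a URL that ends in a slash or has no path falls back to the default name.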

Next, we move to a step-by-step description of the asynchronous file downloading method.

Asynchronous file downloading

In asynchronous file downloading, all requests are executed in an arbitrary order. This method is considered more efficient than synchronous downloading, but it’s also more difficult and less comfortable to use.

For asynchronous downloading, we still need to create curl easy handles and set options for them. So the first six steps will be the same as for synchronous downloading.

However, asynchronous downloading in libcurl is achieved through the curl-multi interface. With a multi handle and the curl-multi interface, you can perform several transfers in parallel. In this case, every separate transfer is built around a separate curl easy handle.

Let’s see how it works.

1. First, we create a curl multi handle using the curl_multi_init function. Then we need to use the curl_multi_* functions to set options for this handle, add curl easy handles to the multi handle, and perform downloads. Calling the curl_multi_cleanup function will free the curl multi handle.

As with the curl easy handle, we can do all this with the help of unique_ptr:

{code}
using MultiHandle = std::unique_ptr<CURLM, std::function<void(CURLM*)>>;

MultiHandle CreateMultiHandle()
{
    auto curl = MultiHandle(curl_multi_init(), curl_multi_cleanup);
    if (!curl)
    {
        throw std::runtime_error("Failed creating CURL multi object");
    }
    return curl;
}
{/code}

Once again, we’ve created a guard for the CURLM pointer returned by curl_multi_init(). Upon destruction of the unique_ptr smart pointer, this pointer will be passed to the curl_multi_cleanup(CURLM*) function.

2. Using the created guard, we can now create a curl multi handle. Similar to the previous example, we create three curl easy handles for downloading three sample files:

{code}
std::list<EasyHandle> handles(3);
MultiHandle multi_handle;

/* init easy and multi stacks */
try
{
    multi_handle = CreateMultiHandle();
    std::for_each(handles.begin(), handles.end(), [](auto& handle) { handle = CreateEasyHandle(); });
}
catch (const std::exception& ex)
{
    std::cerr << ex.what() << std::endl;
    return -1;
}
{/code}

3. We don’t need to set up any other options for the curl multi handle and can now move to adding curl easy handles to it.

{code}
/* set options */
std::for_each(handles.begin(), handles.end(), [](auto& handle) {
    curl_easy_setopt(handle.get(), CURLOPT_URL, "https://raw.githubusercontent.com/curl/curl/master/docs/examples/multi-double.c");
    set_ssl(handle.get());
    save_to_file(handle.get());
});

/* add the individual transfers */
std::for_each(handles.begin(), handles.end(), [&multi_handle](auto& handle) {
    curl_multi_add_handle(multi_handle.get(), handle.get());
});
{/code}

4. Next, we perform the requests. We call curl_multi_perform once to start the transfers and then loop over the curl_multi_timeout, curl_multi_fdset, select, and curl_multi_perform functions. The loop continues for as long as curl_multi_perform reports a non-zero count of running handles.

In our example, we put this routine into a single dedicated multi_loop function. Let’s see what’s going on inside this loop.

4.1 The first call is to the curl_multi_perform function. During this call, if there’s any operation ready on the socket, whether read or write, it will be executed here. Also, this call assigns the count of currently running curl easy handles to the second argument. If this value is equal to 0, it means all operations have finished.

4.2 If the previous value is positive, it means there are still some curl easy handles running and we should wait for the suggested timeout.

{code}
int still_running = 0; /* keep number of running handles */

/* we start some action by calling perform right away */
curl_multi_perform(multi_handle, &still_running);

while (still_running)
{
    /*...*/
}
{/code}

To find out the suggested timeout duration, we need to call the curl_multi_timeout function. It assigns to the second argument the value in milliseconds that we should wait before the next call to curl_multi_perform. If this value is 0, it means we should call curl_multi_perform without waiting. If the value is negative, libcurl has no timeout set and we pick our own wait time; the code below falls back to one second.

{code}
timeval get_timeout(CURLM* multi_handle)
{
    long curl_timeo = -1;

    /* set a suitable timeout to play around with */
    struct timeval timeout;
    timeout.tv_sec = 1;
    timeout.tv_usec = 0;

    curl_multi_timeout(multi_handle, &curl_timeo);
    if (curl_timeo >= 0)
    {
        timeout.tv_sec = curl_timeo / 1000;
        if (timeout.tv_sec > 1)
            timeout.tv_sec = 1;
        else
            timeout.tv_usec = (curl_timeo % 1000) * 1000;
    }
    return timeout;
}

/*...*/

while (still_running)
{
    struct timeval timeout = get_timeout(multi_handle);
    /*...*/
}
{/code}
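The conversion logic is easy to get subtly wrong, so it can help to isolate it in a pure function that can be checked without a multi handle. The sketch below uses std::chrono instead of timeval; clamp_timeout and its default one-second cap are assumptions made for this example. Unlike get_timeout above, which can return a wait of just under two seconds (for example, for a suggested timeout of 1999 ms), this variant caps the wait at exactly the given limit.

```cpp
#include <algorithm>
#include <chrono>

// curl_timeo is the value reported by curl_multi_timeout(), in milliseconds:
// negative means "no timeout set", zero means "act right away".
std::chrono::milliseconds clamp_timeout(
    long curl_timeo,
    std::chrono::milliseconds cap = std::chrono::milliseconds(1000))
{
    if (curl_timeo < 0)
    {
        return cap;  // libcurl has no suggestion, so fall back to our own cap
    }
    return std::min(std::chrono::milliseconds(curl_timeo), cap);
}
```

Keeping the cap as a parameter makes it straightforward to experiment with shorter waits, such as 100 milliseconds.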

4.3 The next step is to wait for actions on sockets. To do so, we need to extract the file descriptor sets (fd_set) from the curl multi handle using the curl_multi_fdset function. As this function only adds file descriptors to the sets passed in, we need to zero out all fd_set variables using FD_ZERO before calling it.

The maxfd parameter is set to the value of the maximum file descriptor. If this value is -1, we need to wait 100 milliseconds according to the curl_multi_fdset documentation. If this value isn’t equal to -1, we can use the acquired file descriptors in the select function call. It uses maxfd and all fd_sets and timeout values to wait for some actions on the file descriptors. The function call returns the number of sockets that are ready for read/write operations.

{code}
int wait_if_needed(CURLM* multi_handle, timeval& timeout)
{
    fd_set fdread;
    fd_set fdwrite;
    fd_set fdexcep;
    FD_ZERO(&fdread);
    FD_ZERO(&fdwrite);
    FD_ZERO(&fdexcep);
    int maxfd = -1;

    /* get file descriptors from the transfers */
    auto mc = curl_multi_fdset(multi_handle, &fdread, &fdwrite, &fdexcep, &maxfd);
    if (mc != CURLM_OK)
    {
        std::cerr << "curl_multi_fdset() failed, code " << mc << "." << std::endl;
    }

    /* On success the value of maxfd is guaranteed to be >= -1. We call
       sleep for 100ms, which is the minimum suggested value in the
       curl_multi_fdset() doc. */
    if (maxfd == -1)
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }

    int rc = maxfd != -1 ? select(maxfd + 1, &fdread, &fdwrite, &fdexcep, &timeout) : 0;
    return rc;
}

/*...*/

while (still_running)
{
    /*...*/
    auto rc = wait_if_needed(multi_handle, timeout);
    /*...*/
}
{/code}
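The FD_ZERO and select pattern used above can be tried in isolation on any file descriptor. The sketch below assumes a POSIX system (it will not build as-is with Visual Studio, where select works only on sockets) and uses a pipe in place of a network socket; wait_readable and pipe_demo are names invented for this example.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Waits up to timeout_ms for fd to become readable.
// Returns select()'s result: >0 if ready, 0 on timeout, -1 on error.
int wait_readable(int fd, long timeout_ms)
{
    fd_set fdread;
    FD_ZERO(&fdread);      // the set must be cleared before use
    FD_SET(fd, &fdread);   // watch a single descriptor

    timeval timeout;
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;

    // The first argument is the highest descriptor number plus one.
    return select(fd + 1, &fdread, nullptr, nullptr, &timeout);
}

// Usage sketch: the read end of a pipe becomes readable once a byte is written.
bool pipe_demo()
{
    int fds[2];
    if (pipe(fds) != 0)
    {
        return false;
    }
    const bool idle_before = (wait_readable(fds[0], 50) == 0);  // nothing written yet
    const bool wrote = (write(fds[1], "x", 1) == 1);
    const bool ready_after = (wait_readable(fds[0], 50) == 1);  // one descriptor ready
    close(fds[0]);
    close(fds[1]);
    return idle_before && wrote && ready_after;
}
```

The return values follow the same convention that wait_if_needed relies on: zero means the timeout expired, and a positive value is the number of ready descriptors.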

4.4 Next, we call the curl_multi_perform function to handle actions on sockets.

{code}
while (still_running)
{
    /*...*/
    auto rc = wait_if_needed(multi_handle, timeout);
    if (rc >= 0)
    {
        /* timeout or readable/writable sockets */
        curl_multi_perform(multi_handle, &still_running);
    }
}
{/code}

During execution of this loop, all curl easy handles will receive data and write it to a file. This approach is especially useful when you don’t want to deal with threads.

You can find the code for the whole loop here.

5. Finally, after executing all requests, we need to remove curl easy handles from the multi handle:

{code}
multi_loop(multi_handle.get());

std::for_each(handles.begin(), handles.end(), [&multi_handle](auto& handle) {
    curl_multi_remove_handle(multi_handle.get(), handle.get());
});
{/code}

Here’s the full code for asynchronous file downloading:

{code}
int download_asynchronous(void)
{
    std::list<EasyHandle> handles(3);
    MultiHandle multi_handle;

    /* init easy and multi stacks */
    try
    {
        multi_handle = CreateMultiHandle();
        std::for_each(handles.begin(), handles.end(), [](auto& handle) { handle = CreateEasyHandle(); });
    }
    catch (const std::exception& ex)
    {
        std::cerr << ex.what() << std::endl;
        return -1;
    }

    /* set options */
    std::for_each(handles.begin(), handles.end(), [](auto& handle) {
        curl_easy_setopt(handle.get(), CURLOPT_URL, "https://raw.githubusercontent.com/curl/curl/master/docs/examples/multi-double.c");
        set_ssl(handle.get());
        save_to_file(handle.get());
    });

    /* add the individual transfers */
    std::for_each(handles.begin(), handles.end(), [&multi_handle](auto& handle) {
        curl_multi_add_handle(multi_handle.get(), handle.get());
    });

    multi_loop(multi_handle.get());

    std::for_each(handles.begin(), handles.end(), [&multi_handle](auto& handle) {
        curl_multi_remove_handle(multi_handle.get(), handle.get());
    });
    return 0;
}
{/code}

Now it’s time to move to the third method: multiplexing.

Multiplexing

During multiplexing, multiple requests are executed via a single Transmission Control Protocol (TCP) connection. This method was introduced as part of the HTTP/2 protocol.

Multiplexing allows for reusing a single connection to one server for processing multiple HTTP requests. This method can improve the performance of request-heavy applications by eliminating the need to close and reopen server connections.

To use multiplexing in libcurl, we need to configure it to use the HTTP/2 protocol.

1. Libcurl provides support for request multiplexing via the curl-multi interface. We can reuse the asynchronous example and add the following code between setting options for curl easy handles and adding curl easy handles to the curl multi handle (step 3 in the previous section):

{code}
/*...*/
for (auto& handle : handles)
{
    /* HTTP/2 please */
    curl_easy_setopt(handle.get(), CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_2_0);

    /* wait for pipe connection to confirm */
    curl_easy_setopt(handle.get(), CURLOPT_PIPEWAIT, 1L);
}

curl_multi_setopt(multi_handle.get(), CURLMOPT_PIPELINING, CURLPIPE_MULTIPLEX);
{/code}

The CURLOPT_HTTP_VERSION option with the CURL_HTTP_VERSION_2_0 value asks libcurl to use the HTTP/2 protocol. If a server doesn’t support HTTP/2, the protocol is downgraded to HTTP/1.1.

The CURLOPT_PIPEWAIT option set to 1 tells libcurl to wait for an existing connection to confirm whether it supports multiplexing instead of opening a new connection right away. If it does, that connection will be used for the transfer.

The CURLMOPT_PIPELINING option set to CURLPIPE_MULTIPLEX tells libcurl to multiplex new transfers over existing HTTP/2 connections where possible.
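Since multiplexing only helps transfers that target the same server, it can be useful to check up front which URLs could share a connection. The sketch below groups URLs by origin (scheme plus host and port). The origin_of and group_by_origin helpers are hypothetical and written only for this example, and the parsing is naive. Whether libcurl actually reuses a connection also depends on the server and the negotiated protocol.

```cpp
#include <map>
#include <string>
#include <vector>

// Naive origin extraction: everything before the first '/' that follows "scheme://".
std::string origin_of(const std::string& url)
{
    const std::string::size_type scheme_end = url.find("://");
    if (scheme_end == std::string::npos)
    {
        return url;  // no scheme found: treat the whole string as the origin
    }
    const std::string::size_type path_start = url.find('/', scheme_end + 3);
    return url.substr(0, path_start);  // npos keeps the whole string
}

// Maps each origin to the URLs that could, in principle, share one connection.
std::map<std::string, std::vector<std::string>> group_by_origin(
    const std::vector<std::string>& urls)
{
    std::map<std::string, std::vector<std::string>> groups;
    for (const auto& url : urls)
    {
        groups[origin_of(url)].push_back(url);
    }
    return groups;
}
```

In the examples from this article, all three transfers point at raw.githubusercontent.com, so they fall into a single group.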

Here’s a full example of a multiplexing method implementation in libcurl:

{code}
/*
 * Download many transfers over HTTP/2 using the same connection!
 */
int download_multiplexing(void)
{
    std::list<EasyHandle> handles(3);
    MultiHandle multi_handle;

    /* init easy and multi stacks */
    try
    {
        multi_handle = CreateMultiHandle();
        for (auto& handle : handles)
        {
            handle = CreateEasyHandle();

            /* HTTP/2 please */
            curl_easy_setopt(handle.get(), CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_2_0);

            /* wait for pipe connection to be confirmed */
            curl_easy_setopt(handle.get(), CURLOPT_PIPEWAIT, 1L);
        }
    }
    catch (const std::exception& ex)
    {
        std::cerr << ex.what() << std::endl;
        return -1;
    }

    for (auto& handle : handles)
    {
        curl_easy_setopt(handle.get(), CURLOPT_URL, "https://raw.githubusercontent.com/curl/curl/master/docs/examples/http2-download.c");
        set_ssl(handle.get());
        save_to_file(handle.get());

        /* add the individual transfers */
        curl_multi_add_handle(multi_handle.get(), handle.get());
    }

    curl_multi_setopt(multi_handle.get(), CURLMOPT_PIPELINING, CURLPIPE_MULTIPLEX);

    multi_loop(multi_handle.get());

    std::for_each(handles.begin(), handles.end(), [&multi_handle](auto& handle) {
        curl_multi_remove_handle(multi_handle.get(), handle.get());
    });
    return 0;
}
{/code}

Which method should you choose?

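Let’s summarize the pros and cons of each libcurl file downloading method, as described in the sections above:

Synchronous: the easiest to implement, but requests run strictly one after another, so it’s inefficient for multiple files.
Asynchronous: transfers run in parallel without extra threads, but the curl-multi loop is harder to write.
Multiplexing: several requests share a single connection to one server, but it requires HTTP/2; otherwise, libcurl falls back to HTTP/1.1.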

You can find the full source code of all examples from this article on our GitHub page.

Conclusion

Downloading files is one of the basic activities any application should be able to perform. Developers can enable a C++ solution to download files with libcurl, a popular file transfer library.

The three file downloading methods covered in this article are synchronous, asynchronous, and multiplexing. Each has its advantages and drawbacks and is best used in specific cases.

At Apriorit, we have a team of passionate Linux, macOS, and C++ developers who create performant, stable, and secure applications of any complexity. Get in touch with us and we’ll start discussing your next project right away.

Using Libcurl to Download Files in C++ Synchronously, Asynchronously, and with Multiplexing | Apriorit (2024)