This project builds on my earlier multithreaded project, in which I developed a simplified web server. Here, the existing getfile server is turned into a proxy server, and a simple cache server is implemented. The project consists of two parts, focusing on inter-process communication (IPC) and shared memory usage.
Transform the existing getfile server into a proxy server that accepts GETFILE requests and translates them into HTTP requests.
Use of libcurl:
libcurl is a library for transferring data with URLs. It supports a wide range of protocols and provides easy-to-use interfaces. I implemented the handle_with_curl function to manage HTTP requests and responses. The function constructs a URL from a base and path, performs the request, and handles the response based on HTTP status codes.
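The URL construction step can be looked at in isolation. The sketch below is a hypothetical helper (build_url is not a name from the project) that joins a base and a path with snprintf and reports truncation, which matters when the combined length exceeds the fixed-size buffer.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of the project code: joins base and path
 * into out. Returns 0 on success, -1 if the result did not fit in
 * out_size bytes or snprintf reported an error. */
int build_url(char *out, size_t out_size, const char *base, const char *path)
{
    int n = snprintf(out, out_size, "%s%s", base, path);
    if (n < 0 || (size_t)n >= out_size) {
        return -1; /* output was truncated or formatting failed */
    }
    return 0;
}
```

Checking the return value of snprintf this way lets the handler reject an over-long request instead of sending a silently truncated URL to the remote server.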
Dynamic Memory Allocation:
Initially, the buffer was statically allocated, which limited the handling of larger data. I switched to dynamic memory allocation to manage memory efficiently and accommodate web content of varying sizes. I used realloc to grow the buffer as new data chunks arrive, so that each chunk is appended without overflowing the buffer.
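The project's own WriteMemoryCallback and struct MemoryStruct are not shown, so the following is only a minimal sketch of how such a callback might look, modeled on the common libcurl in-memory pattern. The field names memory and size match the way the handler uses them; everything else is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed layout: a growable buffer and the number of bytes stored so far. */
struct MemoryStruct {
    char *memory;
    size_t size;
};

/* Write callback with the signature libcurl expects for CURLOPT_WRITEFUNCTION.
 * Appends size * nmemb bytes to the buffer passed through userp. */
size_t WriteMemoryCallback(void *contents, size_t size, size_t nmemb, void *userp)
{
    size_t realsize = size * nmemb;
    struct MemoryStruct *mem = (struct MemoryStruct *)userp;

    /* Grow by the chunk size plus one byte for a terminating NUL. */
    char *grown = realloc(mem->memory, mem->size + realsize + 1);
    if (grown == NULL) {
        return 0; /* returning fewer bytes than received aborts the transfer */
    }
    mem->memory = grown;
    memcpy(mem->memory + mem->size, contents, realsize);
    mem->size += realsize;
    mem->memory[mem->size] = '\0';
    return realsize;
}
```

Assigning the result of realloc to a temporary pointer keeps the old buffer reachable if the allocation fails, so the caller can still free it.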
Error Handling:
Proper error handling ensures robustness and reliability. By checking HTTP status codes, I differentiate between successful requests and different error states, such as file not found (404). I used curl_easy_getinfo to retrieve HTTP status codes and tailored responses to the client based on these codes.
#include <stdio.h>
#include <stdlib.h>
#include <curl/curl.h>
// Fetch base URL (arg) + path over HTTP and relay the result to the getfile client
ssize_t handle_with_curl(gfcontext_t *ctx, const char *path, void *arg) {
    CURL *curl_handle;
    CURLcode res;
    struct MemoryStruct chunk;
    char url[MAX_REQUEST_N]; // full URL: base URL followed by the path
    // Initialize the MemoryStruct; WriteMemoryCallback grows it with realloc
    chunk.memory = malloc(1); // Initial buffer allocation
    chunk.size = 0;
    if (!chunk.memory) {
        return SERVER_FAILURE;
    }
    // Initialize curl (curl_global_init is ideally called once at program startup)
    curl_global_init(CURL_GLOBAL_ALL);
    curl_handle = curl_easy_init();
    if (!curl_handle) {
        free(chunk.memory); // do not leak the buffer on early exit
        return SERVER_FAILURE;
    }
    // Construct the full URL with the path
    snprintf(url, sizeof(url), "%s%s", (char *)arg, path);
    // Set curl options
    curl_easy_setopt(curl_handle, CURLOPT_URL, url);
    curl_easy_setopt(curl_handle, CURLOPT_WRITEFUNCTION, WriteMemoryCallback);
    curl_easy_setopt(curl_handle, CURLOPT_WRITEDATA, (void *)&chunk);
    // Perform the request
    res = curl_easy_perform(curl_handle);
    if (res == CURLE_OK) {
        long http_code = 0;
        curl_easy_getinfo(curl_handle, CURLINFO_RESPONSE_CODE, &http_code);
        if (http_code == 200) {
            size_t bytes_sent = chunk.size; // remember the size before freeing the buffer
            // Send header to client
            gfs_sendheader(ctx, GF_OK, chunk.size);
            // Send the content to the client
            gfs_send(ctx, chunk.memory, chunk.size);
            printf("file sent successfully\n");
            curl_easy_cleanup(curl_handle);
            free(chunk.memory);
            return bytes_sent;
        } else if (http_code == 404) {
            printf("file not found\n");
            gfs_sendheader(ctx, GF_FILE_NOT_FOUND, 0);
        } else {
            printf("server error\n");
            gfs_sendheader(ctx, GF_ERROR, 0);
        }
    } else {
        // The transfer itself failed (DNS, connection, and so on)
        fprintf(stderr, "curl_easy_perform() failed: %s\n", curl_easy_strerror(res));
        gfs_sendheader(ctx, GF_FILE_NOT_FOUND, 0);
    }
    // Cleanup
    curl_easy_cleanup(curl_handle);
    free(chunk.memory);
    return SERVER_FAILURE;
}
The flow of control in the proxy server is as follows:
- Initialization: the server starts by initializing libcurl and preparing to handle incoming connections.
- Request handling: for each GETFILE request, the handler builds the full URL and uses libcurl to perform an HTTP GET request.
- Response collection: the response body is gathered in memory during curl_easy_perform.
- On HTTP 200, the header and the content are sent to the client with GF_OK.
- On HTTP 404, the client receives GF_FILE_NOT_FOUND.
- On any other status code, the client receives GF_ERROR.
- Cleanup: the handler frees the buffer and releases the libcurl resources.
Implement a cache process that communicates with the proxy via shared memory. The shared memory functions are defined in shm_channel.[ch].
1. Shared Memory Structure:
Two structures describe the shared memory. shared_memory_t stores the overall state of the shared memory: the number of segments, the size of each segment, the total size, and a pointer to the first segment structure. shared_memory_segment_t holds the per-segment state: the total file size, the effective data size, a status field, the data storage area, and three semaphores for control. The advantage of this layout is that each segment, as well as the overall properties of the shared memory, can be reached simply by calculating pointer offsets.
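#include <semaphore.h>
#include <stddef.h>
typedef struct {
    size_t data_size;        // number of valid bytes currently in data[]
    size_t total_file_size;  // total size of the file being transferred
    int status;
    sem_t full_semaphore;    // posted by the writer when data is ready
    sem_t empty_semaphore;   // posted by the reader when the segment is free
    sem_t start_semaphore;   // posted by the cache when the transfer begins
    char data[];             // payload area of segment_size bytes
} shared_memory_segment_t;
typedef struct {
    int segment_count;       // number of segments, as configured by webproxy
    size_t segment_size;     // size of one segment
    size_t total_size;       // total size of the memory space
    shared_memory_segment_t *slots; // pointer to the first segment
} shared_memory_t;
struct segment_msg {
    long mtype;                     // message type
    int segment_id;                 // index of the shared memory segment
    char filepath[MAX_PATH_LENGTH]; // requested file path
};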
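The offset calculation can be sketched as follows. The later code calls a helper named segment_pointer whose body is not shown, so this is one plausible implementation rather than the project's own. It assumes the segments are laid out back to back immediately after the shared_memory_t header, matching the total size computed in create_shared_memory, and that segment_size is a multiple of the platform alignment so that the semaphores in each segment stay aligned.

```c
#include <semaphore.h>
#include <stddef.h>

typedef struct {
    size_t data_size;
    size_t total_file_size;
    int status;
    sem_t full_semaphore;
    sem_t empty_semaphore;
    sem_t start_semaphore;
    char data[];
} shared_memory_segment_t;

typedef struct {
    int segment_count;
    size_t segment_size;
    size_t total_size;
    shared_memory_segment_t *slots;
} shared_memory_t;

/* Assumed implementation: returns the address of segment `index`,
 * or NULL if the index is out of range. */
shared_memory_segment_t *segment_pointer(shared_memory_t *shm, int index)
{
    if (shm == NULL || index < 0 || index >= shm->segment_count) {
        return NULL;
    }
    /* Each segment occupies its header plus segment_size payload bytes. */
    size_t stride = sizeof(shared_memory_segment_t) + shm->segment_size;
    char *base = (char *)shm + sizeof(shared_memory_t);
    return (shared_memory_segment_t *)(base + (size_t)index * stride);
}
```

Computing the address from the mapping's own base, rather than from a pointer stored inside the shared region, works in both processes even when each one maps the region at a different virtual address.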
2. Shared Memory and Message Queues:
Shared memory is created by the create_shared_memory function defined in shm_channel.[ch], which sets the properties of shared_memory_t from the provided int segment_count and size_t segment_size. Since the sizes of the structures are known, the starting position of each shared_memory_segment_t can be calculated as an offset, which allows its internal fields to be configured and its semaphores to be initialized.
When a request arrives at the proxy, handle_with_cache is invoked. This function sends the task information to the cache process via the message queue. The task information is carried in struct segment_msg, which holds the task type, the index of the corresponding segment, and the requested file path.
shared_memory_t* create_shared_memory(int segment_count, size_t segment_size) {
    int shm_fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0666);
    if (shm_fd == -1) {
        perror("Failed to open shared memory");
        return NULL;
    }
    // Header followed by segment_count segments (segment header + payload)
    size_t total_size = sizeof(shared_memory_t) +
        segment_count * (sizeof(shared_memory_segment_t) + segment_size);
    if (ftruncate(shm_fd, total_size) == -1) {
        perror("Failed to set shared memory size");
        close(shm_fd);
        return NULL;
    }
    shared_memory_t *shm_ptr = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0);
    if (shm_ptr == MAP_FAILED) {
        perror("Failed to map shared memory");
        close(shm_fd);
        return NULL;
    }
    ..................... // setting arguments
    printf("Attempting mmap with size: %zu\n", total_size);
    close(shm_fd); // the mapping stays valid after the descriptor is closed
    return shm_ptr;
}
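The task message that handle_with_cache sends can be sketched as below. The `long mtype` field suggests a System V message queue, so the sketch uses msgsnd; that choice, the helper names fill_segment_msg and send_segment_msg, the MSG_TYPE_REQUEST constant, and the value of MAX_PATH_LENGTH are all assumptions made for illustration.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define MAX_PATH_LENGTH 256   /* assumed value; the real constant is not shown */
#define MSG_TYPE_REQUEST 1L   /* hypothetical message type */

struct segment_msg {
    long mtype;                     /* message type, must be > 0 */
    int segment_id;                 /* index of the shared memory segment */
    char filepath[MAX_PATH_LENGTH]; /* requested file path */
};

/* Hypothetical helper: fills a request message.
 * Returns 0 on success, -1 if the path does not fit. */
int fill_segment_msg(struct segment_msg *msg, int segment_id, const char *path)
{
    size_t len = strlen(path);
    if (len >= sizeof(msg->filepath)) {
        return -1;
    }
    memset(msg, 0, sizeof(*msg));
    msg->mtype = MSG_TYPE_REQUEST;
    msg->segment_id = segment_id;
    memcpy(msg->filepath, path, len + 1); /* copy including the NUL */
    return 0;
}

/* Hypothetical helper: sends the message on an already-open queue.
 * The size passed to msgsnd excludes the leading mtype field. */
int send_segment_msg(int msqid, const struct segment_msg *msg)
{
    return msgsnd(msqid, msg, sizeof(*msg) - sizeof(long), 0);
}
```

Only the segment index and the path travel through the queue; the file contents themselves move through the shared memory segment that the index identifies.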
3. Boss-worker Threads and Tasks Steque
On the proxy side, webproxy creates multiple threads and a task queue during initialization. This queue contains the index of each shared memory segment. A worker thread takes a segment index from the front of the queue and, once its request is complete, reinserts the index at the end of the queue. This approach ensures that every segment is used in turn, which keeps all segments busy in a multithreaded environment.
On the cache side, the boss thread places the received messages into the task queue, while the worker threads it created retrieve messages from the queue and extract the segment ID and the requested file path. This enables both the proxy and the cache to handle requests and transfer files concurrently.
4. Signal Handling:
start_semaphore is used by the cache to signal the proxy that the task has been received and the file transfer has commenced. full_semaphore and empty_semaphore manage the read and write operations on a segment. The initial value of empty_semaphore is 1, while full_semaphore and start_semaphore are initialized to 0. The writer waits on empty_semaphore before copying data into the segment and then posts full_semaphore to indicate that reading can begin. The reader waits on full_semaphore, copies the data out, and posts empty_semaphore so that the next chunk can be written.
ssize_t write_to_shared_memory(shared_memory_t *shm_ptr, int slot_index, const char *data, size_t size) {
    // Validate the index before computing the segment address
    if (slot_index < 0 || slot_index >= shm_ptr->segment_count) {
        fprintf(stderr, "Invalid slot index\n");
        return -1;
    }
    if (size == 0 || size > shm_ptr->segment_size) {
        fprintf(stderr, "Data size must be between 1 and the segment size\n");
        return -1;
    }
    shared_memory_segment_t *slots = segment_pointer(shm_ptr, slot_index);
    sem_wait(&slots->empty_semaphore); // wait until the segment is free
    memcpy(slots->data, data, size);
    slots->data_size = size;
    sem_post(&slots->full_semaphore);  // signal that data is ready to read
    return size;
}
ssize_t read_from_shared_memory(shared_memory_t *shm_ptr, int slot_index, char *buffer, size_t buffer_size) {
    // Validate the index before computing the segment address
    if (slot_index < 0 || slot_index >= shm_ptr->segment_count) {
        fprintf(stderr, "Invalid slot index\n");
        return -1;
    }
    shared_memory_segment_t *slots = segment_pointer(shm_ptr, slot_index);
    sem_wait(&slots->full_semaphore);  // wait until the writer has filled the segment
    if (slots->data_size > buffer_size) {
        fprintf(stderr, "Slot %d: data size: %zu > Buffer size: %zu\n", slot_index, slots->data_size, buffer_size);
        sem_post(&slots->full_semaphore); // leave the data in place for a later read
        return -1;
    }
    memcpy(buffer, slots->data, slots->data_size);
    size_t data_size = slots->data_size;
    // printf("Slot %d: read file: %zu\n", slot_index, data_size);
    sem_post(&slots->empty_semaphore); // signal that the segment can be reused
    return data_size;
}
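The initial semaphore values described in this section could be set up roughly as follows. The project's initialization code is elided in create_shared_memory, so this is a sketch under assumptions: the helper name init_segment_semaphores is hypothetical, and unnamed POSIX semaphores created with sem_init are assumed because the segment structure embeds sem_t fields.

```c
#include <semaphore.h>
#include <stddef.h>

typedef struct {
    size_t data_size;
    size_t total_file_size;
    int status;
    sem_t full_semaphore;
    sem_t empty_semaphore;
    sem_t start_semaphore;
    char data[];
} shared_memory_segment_t;

/* Hypothetical helper: puts one segment into its initial state.
 * Returns 0 on success, -1 if any sem_init call fails. */
int init_segment_semaphores(shared_memory_segment_t *seg)
{
    /* pshared = 1 because the proxy and the cache are separate processes. */
    if (sem_init(&seg->empty_semaphore, 1, 1) == -1) return -1; /* segment starts free */
    if (sem_init(&seg->full_semaphore, 1, 0) == -1) return -1;  /* no data yet */
    if (sem_init(&seg->start_semaphore, 1, 0) == -1) return -1; /* transfer not started */
    seg->data_size = 0;
    seg->total_file_size = 0;
    seg->status = 0;
    return 0;
}
```

With empty_semaphore at 1 and full_semaphore at 0, the first write proceeds immediately while the first read blocks until that write has posted full_semaphore.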
Testing was carried out with gfclient_download and gfclient_measure to ensure correct file retrieval and performance measurement. Specifically, after compiling, three terminals were opened to simulate the client, server, and cache. When a transmission failed, print statements were added to both the server and the cache to pinpoint the location of the failure. Once it was identified, gdb was used for debugging, with breakpoints set at the relevant locations to check the values of variables.
During development, the following resources were consulted:
