Remote Procedure Call (RPC): Definition, Features, and Execution Steps

Lecture 4

4.1 Remote Procedure Call Concept

The idea of a Remote Procedure Call (RPC) is to extend the familiar mechanism for transferring control and data within a program running on one machine so that it transfers control and data over a network. RPC tools are designed to simplify the organization of distributed computing. RPC is most effective in applications where remote components communicate interactively, with fast response times and relatively small amounts of data transferred. Such applications are called RPC-oriented.

The characteristic features of calling local procedures are: asymmetry, that is, one of the interacting parties is the initiator; synchronicity, that is, execution of the calling procedure stops from the moment the request is issued and is resumed only after the called procedure returns.

Implementing remote calls is much more complicated than implementing local procedure calls. To begin with, since the calling and called procedures execute on different machines, they have different address spaces, and this creates problems when passing parameters and results, especially if the machines are not identical. Since RPC cannot rely on shared memory, RPC parameters must not contain pointers to non-stack memory locations, and parameter values must be copied from one computer to another. The next difference between RPC and a local call is that RPC necessarily uses the underlying communication system, but this should not be explicitly visible either in the definition of the procedures or in the procedures themselves.

Remoteness introduces additional problems. The calling program and a called local procedure on the same machine execute within a single process, but an RPC involves at least two processes, one on each machine. If one of them crashes, the following situations may arise: if the calling procedure crashes, the remotely called procedures become "orphans", and if the remote procedures crash, the calling procedures become "orphaned parents", waiting in vain for a response from the remote procedures.

In addition, there are a number of problems associated with the heterogeneity of programming languages and operating environments: the data structures and procedure call structures supported in any one programming language are not supported in the same way in all other languages.

These and some other problems are solved by the widespread RPC technology, which underlies many distributed operating systems.

Basic RPC Operations

To understand how RPC works, let's first consider a local procedure call on an ordinary stand-alone machine. Take, for example, the system call

count = read(fd, buf, nbytes);

where fd is an integer, buf is an array of characters, and nbytes is an integer.

To make the call, the calling procedure pushes the parameters onto the stack in reverse order. After read has executed, it places the return value in a register, removes the return address from the stack, and returns control to the calling procedure, which pops the parameters from the stack, restoring it to its original state. Note that in C, parameters can be passed either by value or by reference; strictly speaking, C passes every argument by value, and passing by reference is achieved by passing a pointer. To the called procedure, a value parameter is simply an initialized local variable. The called procedure can change it without affecting the original variable in the calling procedure.

If a pointer to a variable is passed to the called procedure, then changing the value of this variable by the called procedure entails changing the value of this variable for the calling procedure. This fact is very significant for RPC.

There is another parameter-passing mechanism that C does not use. It is called call-by-copy/restore: the caller copies the variables onto the stack as values and, after the call completes, copies them back over the original values in the calling procedure.

The choice of parameter-passing mechanism is made by the language designers. Sometimes it depends on the type of the data being passed. In C, for example, integers and other scalar data are always passed by value, while an array is passed as a pointer to its first element, which has the effect of passing it by reference.
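The three mechanisms discussed above can be compared in a small C sketch. All function names here are hypothetical, and copy/restore is simulated by hand, since C has no such parameter mode.

```c
#include <stddef.h>

/* By value: the callee works on its own copy of n. */
void increment_by_value(int n) {
    n = n + 1;                /* changes only the local copy */
    (void)n;
}

/* Through a pointer: the callee can change the caller's variable. */
void increment_through_pointer(int *n) {
    *n = *n + 1;
}

/* An array argument arrives as a pointer to its first element,
   so the callee writes directly into the caller's array. */
void fill(char *buf, size_t nbytes, char value) {
    for (size_t i = 0; i < nbytes; i++) {
        buf[i] = value;
    }
}

/* Call-by-copy/restore, simulated: copy the variable in, let the
   callee work on the copy, then copy the result back over the original. */
void call_with_copy_restore(int *var, void (*callee)(int *)) {
    int copy = *var;          /* copy in */
    callee(&copy);            /* callee sees only the copy */
    *var = copy;              /* restore */
}
```

Calling increment_by_value(a) leaves a unchanged, while increment_through_pointer(&a) changes it. The copy/restore variant gives the same final result as the pointer version here, but the callee never touches the caller's memory directly, which is why this mode is of interest when caller and callee do not share an address space.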

The idea behind RPC is to make a remote procedure call look as similar as possible to a local procedure call. In other words, make RPC transparent: the calling procedure does not need to know that the called procedure is on another machine, and vice versa.

RPC achieves transparency as follows. When the called procedure is actually remote, a different version of the procedure, called a client stub, is placed in the library in place of the local procedure. Like the original procedure, the stub is called using the normal calling sequence, and it also traps to the kernel. Unlike the original procedure, however, it does not place the parameters in registers and ask the kernel for data; instead, it builds a message to be sent to the kernel of the remote machine.

RPC Execution Stages

The interaction of software components during a remote procedure call is illustrated in Figure 3.2.

Figure 3.2. Remote Procedure Call

Once the client program has called the client stub, the stub's first task is to fill a buffer with the message to be sent. In some systems, the client stub has a single fixed-length buffer that is filled from scratch for each new request. In other systems, the message buffer is a pool of buffers for individual message fields, some of which are already filled in. This approach is especially suitable when the packet format has many fields whose values rarely change from call to call.

The parameters must then be converted to the appropriate format and inserted into the message buffer. At this point the message is ready to be sent, so the stub traps to the kernel.
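As a minimal sketch of this packing step, the client stub for read(fd, buf, nbytes) might build its request as follows. The message layout (procedure number, fd, nbytes as 32-bit big-endian fields) and the names PROC_READ, put_u32, and marshal_read_request are assumptions made for illustration, not a format taken from the text.

```c
#include <stddef.h>
#include <stdint.h>

enum { PROC_READ = 1 };        /* hypothetical procedure number */
enum { REQUEST_SIZE = 12 };    /* three 32-bit fields */

/* Store a 32-bit value in big-endian byte order so that machines
   with different native byte orders agree on the format. */
static void put_u32(uint8_t *out, uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Client-stub side of read(fd, buf, nbytes). Only the input parameters
   travel in the request: buf is an output parameter, so its contents
   come back in the reply instead of being sent as a pointer.
   Returns the number of bytes written to msg, or 0 if msg is too small. */
size_t marshal_read_request(uint8_t *msg, size_t cap,
                            int32_t fd, uint32_t nbytes) {
    if (cap < REQUEST_SIZE) {
        return 0;
    }
    put_u32(msg + 0, PROC_READ);
    put_u32(msg + 4, (uint32_t)fd);
    put_u32(msg + 8, nbytes);
    return REQUEST_SIZE;
}
```

Writing the bytes out one at a time, rather than copying the integers from memory, is what makes the message independent of the sender's byte order.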

When the kernel gains control, it switches contexts: it saves the processor registers and the memory map (page descriptors) and installs a new memory map for running in kernel mode. Because the kernel and user contexts are different, the kernel must copy the message into its own address space in order to access it, fill in the destination address (and possibly other header fields), and pass the message to the network interface. This completes the work on the client side. A retransmission timer is started, and the kernel can either busy-wait for the reply or pass control to the scheduler, which selects another process to run. In the first case the request completes faster, but there is no multiprogramming.

On the server side, the receiving hardware places the incoming bits either in an on-board buffer or in main memory. When the whole packet has arrived, an interrupt is generated. The interrupt handler checks the packet for correctness and determines which stub it should be delivered to. If no stub is waiting for this packet, the handler must either buffer it or discard it. If a stub is waiting, the message is copied to it. Finally, a context switch restores the registers and memory map to the values they had at the moment the stub made its receive call.

Now the server stub starts working. It unpacks the parameters and pushes them onto the stack in the appropriate way. When everything is ready, the server procedure is called. After the procedure has executed, the server sends the results back to the client, performing all the steps described above in reverse order.
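The server stub's job can be sketched in the same spirit. The request and reply layouts, and the names server_read and server_stub, are hypothetical; the "server procedure" here reads from a fixed in-memory string rather than a real file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { PROC_READ = 1 };   /* hypothetical procedure number */

static uint32_t get_u32(const uint8_t *in) {
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

static void put_u32(uint8_t *out, uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* The server procedure itself. fd is ignored in this sketch. */
static int32_t server_read(int32_t fd, uint8_t *buf, uint32_t nbytes) {
    static const char data[] = "hello";
    uint32_t avail = (uint32_t)(sizeof data - 1);
    uint32_t n = nbytes < avail ? nbytes : avail;
    (void)fd;
    memcpy(buf, data, n);
    return (int32_t)n;
}

/* Server stub: unpack the request, call the procedure as an ordinary
   local call, and pack the result into the reply.
   Request layout: procedure, fd, nbytes (12 bytes, big-endian).
   Reply layout:   count (4 bytes) followed by count data bytes.
   Returns the reply length, or 0 on a malformed request. */
size_t server_stub(const uint8_t *req, size_t req_len,
                   uint8_t *reply, size_t reply_cap) {
    if (req_len < 12 || get_u32(req) != PROC_READ) {
        return 0;
    }
    int32_t fd = (int32_t)get_u32(req + 4);
    uint32_t nbytes = get_u32(req + 8);
    if (reply_cap < 4 || nbytes > reply_cap - 4) {
        return 0;                     /* reply buffer too small */
    }
    int32_t count = server_read(fd, reply + 4, nbytes);
    put_u32(reply, (uint32_t)count);
    return 4 + (size_t)count;
}
```

From the point of view of server_read, the call is indistinguishable from a local one: its parameters arrive in the usual way and it returns a value, unaware that the caller is a stub acting for a remote client.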

Figure 3.3 shows the sequence of steps performed for each RPC call, and Figure 3.4 shows what proportion of the total RPC execution time is spent on each of the 14 steps listed. The measurements were made on a DEC Firefly multiprocessor workstation. Although the presence of five processors inevitably affected the results, the histogram in the figure gives a general idea of how an RPC executes.

Figure 3.3. Steps in performing an RPC

Figure 3.4. Distribution of time among the 14 stages of RPC execution

1. Call the stub
2. Prepare the buffer
3. Pack the parameters
4. Fill in the header fields
5. Compute the checksum of the message
6. Trap to the kernel
7. Queue the packet for transmission
8. Transfer the message to the controller over the QBUS
9. Transmit over the Ethernet
10. Receive the packet from the controller
11. Run the interrupt handler
12. Compute the checksum
13. Switch context to user space
14. Execute the server stub

Dynamic Binding

Let's consider how the client specifies the location of the server. One approach is to hard-code the server's network address in the client program. The drawback of this approach is its extreme inflexibility: if the server is moved, the number of servers is increased, or the interface is changed, every program that uses the hard-coded address must be recompiled. To avoid these problems, some distributed systems use what is called dynamic binding.

The starting point for dynamic binding is the formal definition (specification) of the server. The specification contains the server name (a file server in Figure 3.5), a version number, and a list of the service procedures that the server provides to clients. For each procedure, its parameters are described, indicating whether each parameter is an input or an output with respect to the server. Some parameters can be both, for example an array that the client sends to the server, which modifies it and returns it to the client (a copy/restore operation).

Figure 3.5. RPC Server Specification

The formal server specification is the input to a stub generator, which creates both the client and the server stubs. These are then placed in the appropriate libraries. When a user (client) program calls any procedure defined in the server specification, the corresponding stub procedure is linked into the program's binary. Likewise, when the server is compiled, the server stubs are linked with it.

When a server starts up, the first thing it does is export its interface to a special program called a binder. This process, known as registering the server, involves the server passing the binder its name, version number, a unique identifier, and a handle used to locate the server. The handle is system-dependent and can be an IP, Ethernet, X.500, or some other address; it may also carry other information, such as authentication data.

When a client calls one of the remote procedures, for example read, for the first time, the client stub sees that it is not yet bound to a server and sends a message to the binder asking to import the interface of the required version of the required server. If such a server exists, the binder passes its handle and unique identifier to the client stub.

When sending a request message, the client stub uses the handle as the address. The message contains the parameters and the unique identifier, which the server's kernel uses to route the incoming message to the right server if several are running on that machine.
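The export and import operations can be illustrated with a toy in-memory binder. This is only a sketch under simplifying assumptions: a real binder is a separate process reached over the network, and the structure, the table size, and the names binder_register and binder_lookup are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAX_SERVERS = 16 };

struct binding {
    char     name[32];
    uint32_t version;
    uint32_t unique_id;
    char     handle[64];   /* e.g. a network address, kept as text here */
};

static struct binding table[MAX_SERVERS];
static size_t table_len = 0;

/* Export: a server registers its interface with the binder.
   Returns 0 on success, -1 if the table is full or a string is too long. */
int binder_register(const char *name, uint32_t version,
                    uint32_t unique_id, const char *handle) {
    if (table_len == MAX_SERVERS ||
        strlen(name) >= sizeof table[0].name ||
        strlen(handle) >= sizeof table[0].handle) {
        return -1;
    }
    struct binding *b = &table[table_len++];
    strcpy(b->name, name);       /* lengths were checked above */
    strcpy(b->handle, handle);
    b->version = version;
    b->unique_id = unique_id;
    return 0;
}

/* Import: a client stub asks for a server by name and version.
   Returns the matching entry, or NULL if no such server is registered. */
const struct binding *binder_lookup(const char *name, uint32_t version) {
    for (size_t i = 0; i < table_len; i++) {
        if (table[i].version == version &&
            strcmp(table[i].name, name) == 0) {
            return &table[i];
        }
    }
    return NULL;
}
```

A client stub would call binder_lookup once, on the first remote call, and cache the returned handle and unique identifier for later requests. A lookup with a version number that no server has registered returns NULL, which corresponds to the import failing.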

This method of importing and exporting interfaces is highly flexible. For example, several servers may support the same interface, with clients distributed among them at random. The binder can also periodically poll the servers, analyze whether they are working, and automatically deregister a server that has failed, which improves the overall fault tolerance of the system. The method can also support client authentication. For example, a server may specify that it can be used only by clients from a specific list.

However, dynamic binding also has disadvantages, such as the extra time spent exporting and importing interfaces. This overhead can be significant, since many client processes are short-lived and each new process must import the interface again. In addition, in large distributed systems the binder can become a bottleneck, while running several binders adds the overhead of creating and synchronizing them.

RPC Semantics in Case of Failures

Ideally, RPC should function correctly even in the event of failures. Consider the following failure classes:

1. The client cannot locate the server, for example because the required server has failed, or because the client program was compiled long ago and uses an old version of the server interface. In this case the client receives a message containing an error code in response to its request.
2. The request from the client to the server is lost. The simplest solution is to repeat the request after a certain time.
3. The response from the server to the client is lost. This case is harder than the previous one, because some procedures are not idempotent. An idempotent procedure is one whose request can be repeated several times without changing the result. Reading a file is an example. Withdrawing a certain amount from a bank account, however, is not idempotent, and if the response is lost, a repeated request can significantly change the state of the client's account. One possible solution is to make all procedures idempotent. In practice this is not always possible, so another method can be used: the client kernel numbers all requests sequentially. The server kernel remembers the number of the most recent request from each client and, on receiving each request, checks whether it is an original or a retransmission.
4. The server crashes after receiving the request. Idempotency is important here as well, but unfortunately the request-numbering approach cannot be applied. In this case it matters
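The request-numbering scheme described for lost responses can be sketched as follows. This is an illustration only: the fixed-size client table, the saved-reply field, the starting balance, and the names withdraw and handle_withdraw are all assumptions, and the sketch does not address the server-crash case.

```c
#include <stdint.h>

enum { MAX_CLIENTS = 8 };

struct client_state {
    uint32_t last_seq;     /* number of the most recent request handled */
    int32_t  last_reply;   /* reply saved in case it must be resent */
    int      seen;         /* 0 until the first request arrives */
};

static struct client_state clients[MAX_CLIENTS];
static int32_t balance = 100;   /* hypothetical account balance */

/* Non-idempotent operation: running it twice changes the result. */
static int32_t withdraw(int32_t amount) {
    balance -= amount;
    return balance;
}

/* Handle a withdraw request that carries a per-client sequence number.
   A repeated number means the earlier reply was lost, so the saved
   reply is returned and the withdrawal is not executed again. */
int32_t handle_withdraw(unsigned client, uint32_t seq, int32_t amount) {
    /* The modulo only keeps the index in range for this sketch. */
    struct client_state *c = &clients[client % MAX_CLIENTS];
    if (c->seen && c->last_seq == seq) {
        return c->last_reply;            /* retransmission */
    }
    c->seen = 1;
    c->last_seq = seq;
    c->last_reply = withdraw(amount);    /* original request: execute */
    return c->last_reply;
}
```

If the same client sends request number 1 twice, the balance is reduced only once and both calls return the same reply; a request with a new number is executed normally.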

A very important mechanism for client-server applications is RPC (Remote Procedure Call). The RPC package described here was developed by Sun Microsystems and is a collection of tools and library functions. In particular, NIS (Network Information Service) and NFS (Network File System) are built on RPC.

An RPC server consists of a set of procedures that a client can call by sending the server an RPC request together with the procedure parameters. The server invokes the specified procedure on the client's behalf and returns its return value, if any. To be machine-independent, all data exchanged between client and server is converted to the External Data Representation (XDR) format. RPC relies on UDP and TCP sockets to transport the XDR-formatted data. Sun has placed RPC in the public domain, and it is described in a series of RFCs.

Sometimes improvements to an RPC application introduce incompatible changes in the procedure call interface. Simply changing the server would break every application that still expects the original behavior. Therefore, RPC programs have version numbers assigned to them, usually starting with 1, and the number is incremented with each new version of the RPC interface. A server may often offer several versions simultaneously; clients then indicate in their requests which version they want to use.

Network communication between RPC servers and clients is somewhat peculiar. An RPC server offers one or more collections of procedures; each such set is called a program and is uniquely identified by a program number. A list mapping service names to program numbers is usually kept in /etc/rpc, an example of which is given below.

Example 12-4. Sample /etc/rpc file

    #
    # /etc/rpc - miscellaneous RPC-based services
    #
    portmapper      100000  portmap sunrpc
    rstatd          100001  rstat rstat_svc rup perfmeter
    rusersd         100002  rusers
    nfs             100003  nfsprog
    ypserv          100004  ypprog
    mountd          100005  mount showmount
    ypbind          100007
    walld           100008  rwall shutdown
    yppasswdd       100009  yppasswd
    bootparam       100026
    ypupdated       100028  ypupdate
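Each entry in this file consists of a service name, a program number, and optional aliases. A minimal sketch of reading one such line in C is shown below; parse_rpc_line is a hypothetical helper written for illustration, not part of any RPC library, and it ignores the aliases.

```c
#include <stdio.h>

/* Parse one /etc/rpc line of the form "name number [aliases...]".
   Returns 1 and fills name and number on success; returns 0 for blank
   lines, comments, and lines that do not match.
   name must have room for 64 bytes. */
int parse_rpc_line(const char *line, char name[64], unsigned long *number) {
    while (*line == ' ' || *line == '\t') {
        line++;                       /* skip leading whitespace */
    }
    if (*line == '#' || *line == '\n' || *line == '\0') {
        return 0;                     /* comment or empty line */
    }
    return sscanf(line, "%63s %lu", name, number) == 2;
}
```

Given the line for portmapper in the sample file, the helper yields the name "portmapper" and the program number 100000.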

In TCP/IP networks, the authors of RPC were faced with the task of mapping program numbers to common network services. Each server provides a TCP and UDP port for each program and each version. In general, RPC applications use UDP to transmit data and fall back to TCP when the data to be transmitted does not fit into a single UDP datagram.

Of course, client programs need a way to find out which port a program number maps to. Using a configuration file for this would be too inflexible: since RPC applications do not use reserved ports, there is no guarantee that a given port is not already occupied by another application. Hence, RPC applications pick any port they can get and register it with the portmapper daemon. A client that wants to contact a service with a given program number first queries the portmapper to find out the port number of the desired service.
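The register-then-query pattern can be sketched with an in-memory table. This is not the real portmapper interface: portmap_register and portmap_lookup are hypothetical names, and a real portmapper is a separate daemon that servers and clients reach over the network.

```c
#include <stddef.h>
#include <stdint.h>

enum { PROTO_TCP = 6, PROTO_UDP = 17 };   /* IP protocol numbers */
enum { MAX_MAPPINGS = 32 };

struct mapping {
    uint32_t program;
    uint32_t version;
    uint32_t protocol;
    uint16_t port;
};

static struct mapping mappings[MAX_MAPPINGS];
static size_t mapping_count = 0;

/* A server registers the port it managed to obtain.
   Returns 0 on success, -1 if the table is full. */
int portmap_register(uint32_t program, uint32_t version,
                     uint32_t protocol, uint16_t port) {
    if (mapping_count == MAX_MAPPINGS) {
        return -1;
    }
    mappings[mapping_count++] =
        (struct mapping){ program, version, protocol, port };
    return 0;
}

/* A client asks which port serves a given program, version, and
   protocol. Returns 0 if nothing is registered for that combination. */
uint16_t portmap_lookup(uint32_t program, uint32_t version,
                        uint32_t protocol) {
    for (size_t i = 0; i < mapping_count; i++) {
        const struct mapping *m = &mappings[i];
        if (m->program == program && m->version == version &&
            m->protocol == protocol) {
            return m->port;
        }
    }
    return 0;
}
```

Because the table lives only in the memory of the process that holds it, every registration disappears if that process dies, and the servers have to register again.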

This method has the disadvantage of introducing a single point of failure, much like the inetd daemon. However, this case is a little worse, because when the portmapper fails, all RPC port information is lost. This usually means that you must restart all RPC servers manually or reboot the machine.

On Linux, the portmapper is called /sbin/portmap or /usr/sbin/rpc.portmap. Apart from having to be started from the network startup script, the portmapper requires no configuration.

Remote Procedure Call RPC The concept of Remote Procedure Call The idea behind Remote Procedure Call (RPC) is to extend the well-known and understood mechanism for transferring control and data within a program running on one machine to transfer control and data over a network. Remote procedure call tools are designed to facilitate the organization of distributed computing. The greatest efficiency of using RPC is achieved in those applications in which there is interactive communication between remote components with short response times and a relatively small amount of transferred data.

Such applications are called RPC-oriented. The characteristic features of calling local procedures are Asymmetry, that is, one of the interacting parties is the initiator Synchronicity, that is, the execution of the calling procedure stops from the moment the request is issued and resumes only after returning from the called procedure. The implementation of remote calls is much more complicated than the implementation of calls to local procedures.

To begin with, since the calling and called procedures are executed on different machines, they have different address spaces, and this creates problems when passing parameters and results, especially if the machines are not identical. Since RPC cannot rely on shared memory, this means that RPC parameters should not contain pointers to non-stack memory locations and that parameter values ​​should be copied from one computer to another.

The next difference between RPC and a local call is that it necessarily uses the underlying communication system, but this should not be explicitly visible either in the definition of the procedures or in the procedures themselves. Remoteness introduces additional problems. The execution of the calling program and the called local procedure on the same machine is implemented within a single process. But the implementation of RPC involves at least two processes - one in each machine.

In case one of them crashes, the following situations may arise: if the calling procedure crashes, the remotely called procedures will become orphaned, and if the remote procedures crash, the calling procedures will become orphaned parents, who will wait in vain for a response from the remote procedures. In addition, there are a number of problems associated with the heterogeneity of programming languages ​​and operating environments, the data structures and procedure call structures supported in any one programming language are not supported in the same way in all other languages.

These and some other problems are solved by the widespread RPC technology, which underlies many distributed operating systems.

Basic RPC Operations

To understand how RPC works, let's first consider a local procedure call on a conventional standalone machine. Take, for example, the system call count = read(fd, buf, nbytes), where fd is an integer, buf is an array of characters, and nbytes is an integer.

To make the call, the calling procedure pushes the parameters onto the stack in reverse order (Figure 3.1). After read has executed, it places the return value in a register, pops the return address, and returns control to the calling procedure, which removes the parameters from the stack, returning it to its original state.

Note that in C, parameters can be passed either by value or by reference (that is, by passing a pointer). To the called procedure, value parameters are simply initialized local variables. The called procedure can change them without affecting the original values of these variables in the calling procedure. If a pointer to a variable is passed to the called procedure, then a change made through that pointer by the called procedure also changes the variable as seen by the calling procedure. This fact is very significant for RPC.

There is another parameter passing mechanism that is not used in C. It is called call-by-copy/restore: the calling program copies the variables onto the stack as values, and after the call completes copies them back over the original values in the calling procedure.

The choice of parameter passing mechanism is made by the language designers. Sometimes it depends on the type of data being passed. In C, for example, integers and other scalar data are always passed by value, while arrays are always passed by reference.
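The difference between call-by-reference and call-by-copy/restore becomes visible when the same variable is passed for two parameters. The sketch below simulates both mechanisms in Python, which has neither of them built in. A one-element list stands in for a variable, and all function names are made up for the illustration.

```python
def increment_both(a, b):
    """Procedure under test: adds 1 through each of its two parameters."""
    a[0] += 1
    b[0] += 1

def call_by_reference(proc, cell):
    # Both parameters alias the caller's variable.
    proc(cell, cell)

def call_by_copy_restore(proc, cell):
    # Copy the value in for each parameter...
    a, b = [cell[0]], [cell[0]]
    proc(a, b)
    # ...then copy each parameter back over the caller's variable.
    cell[0] = a[0]
    cell[0] = b[0]

x = [0]
call_by_reference(increment_both, x)
y = [0]
call_by_copy_restore(increment_both, y)
print(x[0], y[0])   # 2 1
```

With references the variable is incremented twice. With copy/restore each parameter starts from its own copy of the original value, so the variable ends up incremented only once.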

Fig. 3.1. (a) The stack before the read call is executed; (b) the stack during execution of the procedure; (c) the stack after returning to the calling program

The idea behind RPC is to make a call to a remote procedure look as much as possible like a call to a local procedure. In other words, RPC should be transparent: the calling procedure does not need to know that the called procedure is on another machine, and vice versa.

RPC achieves transparency in the following way. When the called procedure is actually remote, a different version of the procedure, called the client stub, is placed in the library instead of the local procedure. Like the original procedure, the stub is called using the calling sequence of Figure 3.1, and a trap to the kernel occurs. Unlike the original procedure, however, it does not place the parameters in registers and does not ask the kernel for data; instead, it builds a message to be sent to the kernel of the remote machine.

Stages of RPC execution

The interaction of software components during a remote procedure call is illustrated in Figure 3.2. After the client stub has been called by the client program, its first task is to fill a buffer with the message to be sent.

In some systems, the client stub has a single fixed-length buffer that is filled from scratch for each new request. In other systems, the message buffer is a pool of buffers for individual message fields, some of which are already filled in. This method is especially suitable when the packet format consists of a large number of fields whose values mostly do not change from call to call.

The parameters must then be converted to the appropriate format and inserted into the message buffer. At this point the message is ready to be sent, so a trap to the kernel is executed.

Fig. 3.2. Remote Procedure Call

When the kernel gains control, it switches contexts, saving the processor registers and the memory map (page descriptors), and installs a new memory map that will be used while running in kernel mode. Because the kernel and user contexts are different, the kernel must explicitly copy the message into its own address space so that it can access it, fill in the destination address and possibly other header fields, and pass the message to the network interface.

This completes the work on the client side.
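The client-side steps described above (filling a buffer, converting the parameters, handing the message over for transmission) can be sketched as a small stub class. The message layout, the procedure number and the transport callable are all assumptions made for this example. A test double replaces the kernel and the network so that the sketch can run on its own.

```python
import struct

PROC_READ = 1                          # hypothetical procedure number
REQUEST = struct.Struct("!HHii")       # proc number, version, fd, nbytes
REPLY_HEADER = struct.Struct("!i")     # count, followed by the data bytes

class ClientStub:
    """Client-side stand-in for the remote read procedure."""

    def __init__(self, transport):
        # transport(request_bytes) -> reply_bytes stands in for the
        # kernel trap, the network transfer and the wait for the reply.
        self._transport = transport

    def read(self, fd: int, nbytes: int) -> tuple[int, bytes]:
        request = REQUEST.pack(PROC_READ, 1, fd, nbytes)  # marshal parameters
        reply = self._transport(request)                  # send and block
        (count,) = REPLY_HEADER.unpack_from(reply)        # unmarshal result
        data = reply[REPLY_HEADER.size:REPLY_HEADER.size + max(count, 0)]
        return count, data

def fake_transport(request: bytes) -> bytes:
    """Test double that answers every request from the bytes b'hello'."""
    _proc, _version, _fd, nbytes = REQUEST.unpack(request)
    data = b"hello"[:nbytes]
    return REPLY_HEADER.pack(len(data)) + data

stub = ClientStub(fake_transport)
print(stub.read(3, 4))   # (4, b'hell')
```

The caller of stub.read sees an ordinary procedure call. Since buf is an output parameter, the sketch returns the data together with the count instead of writing through a pointer.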

The transmission timer is started, and the kernel can either busy-wait, polling for the reply, or pass control to the scheduler, which selects another process to run. In the first case the request completes faster, but there is no multiprogramming.

On the server side, the incoming bits are placed by the receiving hardware either in an on-board buffer or in main memory. When all the information has been received, an interrupt is generated. The interrupt handler checks the packet for validity and determines which stub it should be passed to. If no stub is waiting for the packet, the interrupt handler must either buffer it or discard it. If a stub is waiting, the message is copied to it. Finally, a context switch is performed, which restores the registers and memory map to the values they had at the moment the stub made its receive call.

Now the server stub starts working. It unpacks the parameters and pushes them onto the stack in the appropriate way. When everything is ready, the server procedure is called. After executing the procedure, the server sends the results back to the client by performing all the steps described above in reverse order. Figure 3.3 shows the sequence of steps that must be executed for each RPC, and Figure 3.4 shows what proportion of the total RPC execution time is spent on each of the 14 steps described.
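The work of the server stub (unpack the parameters, make an ordinary local call, pack the result) can be sketched as follows. The message layout, the procedure table and the in-memory "file" are assumptions chosen for the illustration and are not taken from a real system.

```python
import struct

REQUEST = struct.Struct("!HHii")       # proc number, version, fd, nbytes (assumed layout)
REPLY_HEADER = struct.Struct("!i")     # count, followed by the data bytes
PROC_READ = 1                          # hypothetical procedure number

FILES = {3: b"remote file contents"}   # stand-in for the server's open files

def server_read(fd: int, nbytes: int) -> bytes:
    """The real server procedure; it knows nothing about messages."""
    return FILES[fd][:nbytes]

PROCEDURES = {PROC_READ: server_read}  # dispatch table of the server stub

def server_stub(request: bytes) -> bytes:
    proc, _version, fd, nbytes = REQUEST.unpack(request)   # unpack parameters
    data = PROCEDURES[proc](fd, nbytes)                    # ordinary local call
    return REPLY_HEADER.pack(len(data)) + data             # pack the result

reply = server_stub(REQUEST.pack(PROC_READ, 1, 3, 6))
(count,) = REPLY_HEADER.unpack_from(reply)
print(count, reply[REPLY_HEADER.size:])   # 6 b'remote'
```

The server procedure itself is written as if it were called locally. Only the stub deals with the message format.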

The measurements were made on a DEC Firefly multiprocessor workstation. Although the presence of five processors necessarily affected the results, the histogram in the figure gives a general idea of how the time of an RPC is spent.

Fig. 3.3. Stages of RPC execution

Fig. 3.4. Distribution of time among the 14 stages of RPC execution:

    1. Call the stub

    2. Prepare the buffer

    3. Pack the parameters

    4. Fill in the header fields

    5. Calculate the message checksum

    6. Trap to the kernel

    7. Queue the packet for transmission

    8. Transfer the message to the controller over the QBus

    9. Transmit the packet over the Ethernet

    10. Receive the packet from the controller

    11. Run the interrupt handler

    12. Calculate the checksum

    13. Switch context to user space

    14. Execute the server stub

Dynamic binding

Let's consider how the client specifies the location of the server.

One way to solve this problem is to use the server's network address directly in the client program. The disadvantage of this approach is its extreme inflexibility: when the server is moved, the number of servers is increased, or the interface is changed, and in many other cases, every program that hard-codes the server address has to be recompiled. To avoid these problems, some distributed systems use what is called dynamic binding.

The starting point for dynamic binding is a formal definition of the server specification. The specification contains the name of the file server, its version number, and a list of the service procedures that the server provides to clients (Figure 3.5). For each procedure, its parameters are described, indicating whether each parameter is an input or an output with respect to the server. Some parameters can be both input and output: for example, an array that the client sends to the server, that is modified there, and that is then returned to the client (a copy/restore operation).

Fig. 3.5. RPC server specification

The formal server specification is used as input to a stub generator program, which creates both the client and the server stubs. They are then placed in the appropriate libraries. When a user (client) program calls any procedure defined in the server specification, the corresponding stub procedure is linked into the program's binary code.
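To give a feel for what a stub generator does with a specification, here is a minimal sketch. The specification format, the server name, the version string and the procedure list are all hypothetical, and JSON is used for the messages only to keep the example short. A real generator would emit stub source code or binaries rather than closures.

```python
import json

# Hypothetical specification; names and layout are illustrative only.
SPEC = {
    "name": "file_server",
    "version": "3.1",
    "procedures": {
        "read":  {"in": ["fd", "nbytes"], "out": ["buf"]},
        "write": {"in": ["fd", "buf"],    "out": ["count"]},
    },
}

def generate_client_stubs(spec, transport):
    """Return one stub function per procedure listed in the specification."""
    def make_stub(proc_name, in_params):
        def stub(*args):
            if len(args) != len(in_params):
                raise TypeError(f"{proc_name} expects {in_params}")
            message = {"server": spec["name"], "version": spec["version"],
                       "proc": proc_name, "args": dict(zip(in_params, args))}
            return transport(json.dumps(message))   # hand over for sending
        stub.__name__ = proc_name
        return stub
    return {name: make_stub(name, p["in"])
            for name, p in spec["procedures"].items()}

# The transport here just decodes the message again so it can be inspected.
stubs = generate_client_stubs(SPEC, transport=lambda msg: json.loads(msg))
print(stubs["read"](3, 1024)["args"])   # {'fd': 3, 'nbytes': 1024}
```

Only the input parameters are packed into the request. The output parameters would travel in the reply, which this sketch does not model.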

In the same way, when a server is compiled, the server stubs are linked with it. When the server starts, its very first action is to pass its server interface to a special program called the binder. This process, known as server registration, involves the server passing its name, version number, unique identifier, and a descriptor of the server's location. The descriptor is system independent and can be an IP, Ethernet, X.500, or some other address. It may also contain other information, for example information related to authentication.

When a client calls one of the remote procedures for the first time, for example read, the client stub sees that it is not yet bound to a server and sends a message to the binder asking to import the interface of the required version of the required server. If such a server exists, the binder passes the descriptor and the unique identifier to the client stub.

When sending a request message, the client stub uses the descriptor as the address. The message contains the parameters and the unique identifier, which the server's kernel uses to route the incoming message to the right server if several servers run on that machine.

This method of importing and exporting interfaces is highly flexible. For example, there may be several servers supporting the same interface, with clients distributed among them at random.
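The export and import steps described above can be sketched as a small in-memory registry. The class and method names, the addresses and the version string are assumptions for this example. A real binder would be a separate process reached over the network.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Registration:
    handle: str      # location descriptor, e.g. a network address
    unique_id: int   # identifies the server among those on one machine

class Binder:
    def __init__(self):
        self._table = {}   # (name, version) -> list of Registration

    def register(self, name, version, handle, unique_id):
        """Export: called by a server when it starts."""
        self._table.setdefault((name, version), []).append(
            Registration(handle, unique_id))

    def deregister(self, name, version, unique_id):
        """Called when a server stops or is found to have failed."""
        key = (name, version)
        self._table[key] = [r for r in self._table.get(key, [])
                            if r.unique_id != unique_id]

    def lookup(self, name, version):
        """Import: called by a client stub on its first remote call."""
        candidates = self._table.get((name, version), [])
        if not candidates:
            raise LookupError(f"no server exports {name} version {version}")
        return random.choice(candidates)   # spread clients over the servers

binder = Binder()
binder.register("file_server", "3.1", "10.0.0.5:2049", unique_id=1)
binder.register("file_server", "3.1", "10.0.0.6:2049", unique_id=2)
print(binder.lookup("file_server", "3.1").handle)
```

Because the version is part of the lookup key, a client compiled against an old interface gets an error from the binder instead of talking to an incompatible server.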

Dynamic binding also makes it possible to poll servers periodically, check that they are operational and, in case of failure, deregister them automatically, which increases the overall fault tolerance of the system. The method can also support client authentication: for example, a server may specify that it can be used only by clients from a particular list.

However, dynamic binding has disadvantages, such as the extra overhead and time spent exporting and importing interfaces. These costs can be significant, since many client processes are short-lived, and the interface import procedure must be repeated every time a process starts. In addition, in large distributed systems the binder can become a bottleneck, while running several binders adds the overhead of creating and synchronizing them.

RPC semantics in the event of failures

Ideally, RPC should function correctly in the event of failures.

Consider the following failure classes:

1. The client cannot locate the server, for example because the required server has failed, or because the client program was compiled long ago and uses an old version of the server interface. In this case, the client's request is answered with a message containing an error code.

2. The request from the client to the server is lost. The simplest solution is to repeat the request after a certain time.

3. The reply from the server to the client is lost. This case is harder than the previous one, because some procedures are not idempotent. An idempotent procedure is one whose execution request can be repeated several times without changing the result. Reading a file is an example of such a procedure. Withdrawing a certain amount from a bank account, however, is not idempotent, and if the reply is lost, a repeated request can significantly change the state of the client's account.
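The lost-reply problem can be made concrete with a small sketch. The first half retries a withdrawal whose reply was lost. The second half shows one remedy, in which each request carries a sequence number and the server keeps the last number and reply for each client. The class, the client name and the amounts are invented for the illustration.

```python
class AccountServer:
    def __init__(self, balance):
        self.balance = balance
        self._last = {}   # client_id -> (sequence number, saved reply)

    def withdraw(self, amount):
        """Not idempotent: every execution changes the balance."""
        self.balance -= amount
        return self.balance

    def handle(self, client_id, seq, amount):
        """Execute a numbered request once; answer repeats from the saved reply."""
        last = self._last.get(client_id)
        if last is not None and last[0] == seq:
            return last[1]                      # duplicate: do not re-execute
        reply = self.withdraw(amount)
        self._last[client_id] = (seq, reply)
        return reply

# Reply lost, client retries without numbering: the money is taken twice.
plain = AccountServer(100)
plain.withdraw(30)
plain.withdraw(30)

# Same retry with request numbering: the second copy is recognised.
numbered = AccountServer(100)
numbered.handle("client-A", seq=1, amount=30)
numbered.handle("client-A", seq=1, amount=30)

print(plain.balance, numbered.balance)   # 40 70
```

The sketch keeps only the most recent request per client, which is enough when a client has at most one call outstanding at a time.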

One possible solution is to make all procedures idempotent. In practice this is not always possible, so another method can be used: the client's kernel numbers all requests sequentially. The server's kernel remembers the number of the most recent request from each client, and on receiving a request it determines whether the request is an original or a repeat.

4. The server crashes after receiving the request. Idempotency is important here as well, but unfortunately request numbering does not help. What matters in this case is when the failure occurred: before or after the operation was carried out. The client's kernel cannot tell these situations apart; it knows only that the reply timed out. There are three approaches to this problem:

    Wait until the server reboots and try the operation again. This approach guarantees that the RPC is executed at least once, and possibly more.

    Report the error to the application immediately. This approach guarantees that the RPC is executed at most once.

    Guarantee nothing. When the server fails, the client receives no support; the RPC may not be executed at all, or it may be executed many times. On the other hand, this approach is very easy to implement.

None of these approaches is very attractive. The ideal, a guarantee that the RPC is executed exactly once, cannot be implemented in the general case for fundamental reasons.
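The first two approaches can be contrasted in a short simulation. The server class, the exception and the counters are invented for the sketch, and the crash is modelled as always happening after the operation has run, which is exactly the case the client cannot detect.

```python
class ServerCrash(Exception):
    """Stands in for a reply timeout caused by a server crash."""

class FlakyServer:
    def __init__(self, crashes):
        self.crashes = crashes     # how many calls will fail before success
        self.executions = 0        # how many times the operation really ran

    def call(self):
        # In this sketch the crash always happens after the operation ran,
        # which the client cannot distinguish from a crash before it.
        self.executions += 1
        if self.crashes > 0:
            self.crashes -= 1
            raise ServerCrash
        return "done"

def at_least_once(server):
    while True:                    # keep retrying until a reply arrives
        try:
            return server.call()
        except ServerCrash:
            continue

def at_most_once(server):
    try:
        return server.call()       # a single attempt
    except ServerCrash:
        return None                # report the failure to the application

a, b = FlakyServer(crashes=2), FlakyServer(crashes=2)
print(at_least_once(a), a.executions)   # done 3
print(at_most_once(b), b.executions)    # None 1
```

In the second case the application is told that the call failed even though the operation was in fact carried out once, which shows why neither policy amounts to exactly-once execution.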

Suppose, for example, that the remote operation prints some text, which involves loading the printer buffer and setting one bit in a printer control register to start the printer. A server crash can occur either a microsecond before or a microsecond after the control bit is set. The moment of failure entirely determines the recovery procedure, but the client cannot find out when the failure occurred.

In short, the possibility of a server crash radically changes the nature of RPC and clearly shows the difference between a centralized and a distributed system. In the first case, a server crash also brings down the client, and recovery is impossible. In the second case, recovery actions are both possible and necessary.

5. The client crashes after sending the request. In this case, computation is carried out whose results nobody is waiting for. Such computations are called orphans. Orphans can cause various problems: wasted CPU time, locked resources, and confusion of the reply to a current request with the reply to a request that the client machine issued before it was restarted.

How should orphans be dealt with? Let's look at four possible solutions.

    Destruction. Before the client stub sends an RPC message, it writes a log entry stating what it is about to do. The log is kept on disk or in other fault-tolerant storage. After a crash, the system is rebooted, the log is examined, and the orphans are killed. The disadvantages of this approach are, first, the overhead of writing a disk record for every RPC and, second, possible ineffectiveness because of second-generation orphans created by RPCs that first-generation orphans have issued.

    Reincarnation. Here the problem is solved without writing to disk. Time is divided into sequentially numbered periods (epochs). When a client reboots, it broadcasts a message to all machines announcing the start of a new period. On receiving this message, every machine kills all remote computations. Of course, if the network is partitioned, some orphans may survive.

    Soft reincarnation. This is similar to the previous case, except that not all remote computations are found and killed, only those of the rebooting client.

    Expiration. Each request is given a standard time quantum T within which it must be completed. If the request does not finish within the allotted time, an additional quantum must be requested, which requires extra work. On the other hand, if an interval T is allowed to pass after a client crash before the client is rebooted, all orphans are guaranteed to be gone.

In practice, none of these approaches is desirable; indeed, killing orphans may make matters worse. For example, suppose an orphan has locked one or more database files. If the orphan is suddenly killed, these locks may remain in place. In addition, killed orphans may have left entries in various system queues, which may later cause new processes to be started, and so on.
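As an illustration of the soft reincarnation idea, the sketch below tags every remote computation with the period number of the client that started it and removes only the rebooting client's older computations when a new period is announced. The class, the job and client names and the bookkeeping are assumptions for the example. Cleaning up locks and queue entries, the hard part noted above, is deliberately left out.

```python
class Server:
    def __init__(self):
        self.running = {}   # job id -> (client, epoch) of remote computations
        self.epoch = {}     # client -> latest epoch announced by that client

    def start(self, job_id, client, epoch):
        if epoch < self.epoch.get(client, 0):
            return False                      # request from a previous life
        self.epoch[client] = epoch
        self.running[job_id] = (client, epoch)
        return True

    def new_epoch(self, client, epoch):
        """Handle the broadcast a client sends after rebooting."""
        self.epoch[client] = epoch
        orphans = [j for j, (c, e) in self.running.items()
                   if c == client and e < epoch]
        for job_id in orphans:
            del self.running[job_id]          # kill the orphan
        return orphans

server = Server()
server.start("job-1", "client-A", epoch=1)
server.start("job-2", "client-B", epoch=1)
print(server.new_epoch("client-A", epoch=2))   # ['job-1']
print(sorted(server.running))                  # ['job-2']
```

Requests that still carry an old period number are refused, so a delayed message from before the reboot cannot start a new orphan.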


The idea of ​​calling remote procedures (Remote Procedure Call - RPC) consists of extending the well-known and understood mechanism for transferring control and data within a program running on one machine to transferring control and data over a network. Remote procedure call tools are designed to facilitate the organization of distributed computing.

The greatest efficiency of using RPC is achieved in those applications in which there is interactive communication between remote components with short response times and a relatively small amount of transmitted data. Such applications are called RPC-oriented.

The characteristic features of calling local procedures are:

    asymmetry, that is, one of the interacting parties is the initiator;

    synchronicity, that is, execution of the calling procedure is suspended from the moment the request is issued and resumed only when the called procedure returns.

Implementing remote calls is much more complicated than implementing local procedure calls.

1. To begin with, since the calling and called procedures are executed on different machines, they have different address spaces, and this creates problems when passing parameters and results, especially if the machines are not identical.

Since RPC cannot rely on shared memory, RPC parameters must not contain pointers to non-stack memory locations, and parameter values must be copied from one computer to another.

2. The next difference between RPC and a local call is that RPC necessarily uses the underlying communication system; however, this should not be explicitly visible either in the definition of the procedures or in the procedures themselves.

Remoteness introduces additional problems. The calling program and a called local procedure on the same machine execute within a single process, but an RPC involves at least two processes, one on each machine. If one of them fails, the following situations may occur:

    if the calling procedure crashes, the remotely called procedures become "orphans";

    if the remote procedures terminate abnormally, the calling procedures become "orphaned parents", waiting in vain for a response from the remote procedures.

In addition, there are a number of problems associated with the heterogeneity of programming languages and operating environments: the data structures and procedure call structures supported in one programming language are not supported in the same way in all other languages.

These and some other problems are solved by the widespread RPC technology, which underlies many distributed operating systems.

The idea behind RPC is to make a remote procedure call look as similar as possible to a local procedure call. In other words, make RPC transparent: the calling procedure does not need to know that the called procedure is on another machine, and vice versa.

RPC achieves transparency in the following way. When the called procedure is actually remote, another version of the procedure, called a client stub, is placed in the library instead of the local procedure. Like the original procedure, the stub is called using a calling sequence (as in Figure 3.1), and an interrupt occurs when accessing the kernel. Only, unlike the original procedure, it does not place parameters in registers and does not request data from the kernel; instead, it generates a message to be sent to the kernel of the remote machine.

Fig. 3.2. Remote Procedure Call