Abstract— A human machine interface accepts input and instructions from the user and passes them to the computer or machine for processing. With advances in interfacing technology, it is now possible to operate machines by voice command. Display technology has also undergone a great revolution, and it is possible to move the display along the direction of the user's movement. In this work we propose a model that uses a voice interface for commands and holographic visual projection for display.
Once implemented, the model will allow the user to control the machine via a physical device or by voice command. It will also allow the user to convey a voice message in the form of an encrypted text message through email or SMS. As the user turns, the display will also rotate so that the user can always see the display in front of them.
Keywords—Human machine interface; Display technology; Holographic visual projection; Encryption.
I. Introduction
Combat vehicles usually integrate multiple complex display screens within a small space. The display technology proposed here combines a series of mutually compatible technologies to provide a smart virtual display as the human machine interface. It includes a 3D holographic virtual display interfaced with a sensor for head movement detection. The 3D holographic virtual display is a mid-air holographic screen that enables the display of various control panels without a physical screen.
A combat vehicle's display needs to be smart enough to let the user give instructions orally, since the crew must monitor the outside and also ensure that the vehicle moves in the manner they intend. Because the crew must multitask and keep turning in different directions, the display should also detect the direction of the user's head movement and reposition itself accordingly. Sensors for head movement detection are therefore interfaced with the display; they sense the head movement of the user and enable the screen to move as the user turns or moves. There are various algorithms for head movement detection. One such method recovers 3-D head pose video with a 3-D cross model in order to track continuous head movement. The model is projected to an initial template to approximate the head and is used as a reference. Full head motion can then be detected in video frames using the optical flow method. It uses a camera that is not head-mounted. It can be considered a high-complexity algorithm.
One of the easiest interfaces is the voice recognition interface, compared with feeding input by typing. Nowadays voice-machine interfacing has become common. The crew station human machine interface we propose uses the user's voice commands as one of its interfaces.
The entire display and its functions work on voice commands. Voice recognition software captures and converts your speech via a microphone. Voice recognition applications can transcribe recordings from a number of formats. Everyone’s voice and phrasing sound slightly different, so the most effective programmes use a simple, one-off process called ‘enrolment’ for the software to determine how you speak. Most voice recognition software gives you the ability to start, navigate and control your computer programmes through spoken commands.
The other important feature is speech-to-text conversion for communicating with the main base or a neighboring fellow soldier. This makes it possible to contact them by text message or email.
Encryption algorithms protect the messages so that they are displayed only to the intended end user. Text can be edited very easily. You can highlight the text to be changed by using commands such as “Select line” or “Select paragraph” and then saying the changes you want to make to the selected text. In the case of email, the text is formatted as an email as it is dictated. In the case of SMS, the text is transmitted as simple text.
End-to-end encryption (E2EE) is a system of communication in which only the communicating users can read the messages. The end-to-end encryption paradigm does not directly address risks at the communications endpoints themselves. Even the most perfectly encrypted communication pipe is only as secure as the mailbox on the other end.
Thus various encryption algorithms are used to secure these messages.
In this paper we discuss the technologies and algorithms used for the crew station human machine interface and display, along with end-to-end encrypted communication.
II. Literature Review
Of the earlier research papers related to our project, the following appear to be relevant.
In the paper “Automatic speech recognition: A review” [1], the authors explain that the process of speech recognition begins with a speaker creating an utterance, which consists of sound waves. These sound waves are captured by a microphone and converted into electrical signals. The electrical signals are then converted into digital form to make them understandable by the speech system.
The speech signal is then converted into a discrete sequence of feature vectors, which is assumed to contain only the information about the given utterance that is relevant for its correct recognition. An important property of feature extraction is the suppression of information irrelevant for correct classification, such as information about the speaker (e.g. fundamental frequency) and about the transmission channel (e.g. characteristics of the microphone). Finally, the recognition component finds the best match in the knowledge base for the incoming feature vectors. Sometimes, however, the information conveyed by these feature vectors may be correlated and less discriminative, which may slow down further processing.
Feature extraction methods such as Mel frequency cepstral coefficients (MFCC) provide a way to obtain uncorrelated vectors by means of the discrete cosine transform (DCT).
In the paper “An Efficient Speech Recognition System” [2], Suma Swamy and K. V. Ramakrishnan describe feature extraction as a process that extracts data from the voice signal that is unique for each speaker. The Mel Frequency Cepstral Coefficient (MFCC) technique is often used to create the fingerprint of the sound files. The extracted features are vector quantized using the Vector Quantization (VQ) algorithm, which is used for feature extraction in both the training and testing phases. After feature extraction, feature matching is the actual procedure that identifies the unknown speaker by comparing the extracted features with the database using the DISTMIN algorithm. Hidden Markov processes, in turn, are statistical models in which one tries to characterize the statistical properties of the signal, under the assumption that a signal can be characterized as a random parametric signal whose parameters can be estimated in a precise and well-defined manner.
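The feature-matching step described by Swamy and Ramakrishnan can be illustrated with a small sketch of minimum-distortion codebook matching. This is an assumption-laden illustration and not necessarily the DISTMIN algorithm itself: the function names, the use of Euclidean distance, and the two-dimensional toy vectors standing in for MFCC features are all hypothetical.

```python
import math

def nearest_codeword_distance(vector, codebook):
    """Distance from one feature vector to its closest codeword."""
    return min(math.dist(vector, codeword) for codeword in codebook)

def average_distortion(features, codebook):
    """Mean nearest-codeword distance over all feature vectors."""
    total = sum(nearest_codeword_distance(v, codebook) for v in features)
    return total / len(features)

def identify_speaker(features, codebooks):
    """Return the enrolled speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: average_distortion(features, codebooks[name]))

# Toy codebooks (hypothetical values); real ones would be trained from MFCC vectors.
codebooks = {
    "speaker_a": [(0.0, 0.0), (1.0, 1.0)],
    "speaker_b": [(5.0, 5.0), (6.0, 6.0)],
}
unknown = [(0.9, 1.1), (0.1, -0.1)]
print(identify_speaker(unknown, codebooks))
```

In such a scheme each enrolled speaker is represented only by a compact codebook, so matching an utterance costs one nearest-codeword search per feature vector and per speaker.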
In order to implement an isolated word recognition system using HMMs, the following steps must be taken:
(1) For each uttered word, a Markov model must be built using parameters that optimize the observations of the word.
(2) The maximum likelihood model is calculated for the uttered word.
In the paper “Speech Recognition as Emerging Revolutionary Technology” [3], the authors state that speech recognition is the translation of spoken words into text.
It is also known as “automatic speech recognition”, “ASR”, “computer speech recognition”, “speech to text”, or just “STT”. Some speech recognition (SR) systems use “training”, where an individual speaker reads sections of text into the SR system. These systems analyze the person’s specific voice and use it to fine-tune the recognition of that person’s speech, resulting in more accurate transcription. Both acoustic modelling and language modelling are important parts of modern statistically-based speech recognition algorithms.
Hidden Markov models (HMMs) are widely used in many systems. Language modelling has many other applications, such as smart keyboards and document classification.
The paper “End-to-end Encrypted Messaging Protocols: An Overview” [4] gives an overview of the different core protocols used for decentralized chat and email-oriented services. This work is part of a survey of 30 projects focused on decentralized and/or end-to-end encrypted internet messaging, conducted in the early stages of the H2020 CAPS project NEXTLEAP.
The authors used various email and chat protocols such as SMTP and XMPP for their work.
In the paper “Messengr: End-to-End Encrypted Messaging Application With Server-Side Search Capabilities” [5], the authors implemented a proof-of-concept app that provides strong end-to-end encryption for chats and allows users to search through their encrypted messages without the server learning the contents of any message or what keyword the user searched for.
In the paper “End to End Encryption: An Answer to Security Concerns in the Private Sector” [6], Erik Wehner explains that the algorithm used to encrypt data is one of the most important steps, as it determines how difficult it is for the publicly transmitted encrypted data to be decrypted by an unwanted party. One of the most commonly used cipher algorithms is the Advanced Encryption Standard (AES). A National Institute of Standards and Technology (NIST) publication released on November 26, 2001, specified AES as the standard for securing sensitive information within Federal departments and agencies. According to this publication, AES is a specified form of the Rijndael algorithm, a block cipher that processes data blocks of 128 bits using cipher keys with lengths of 128, 192, or 256 bits. These cipher keys are typically obtained from a Diffie-Hellman key exchange. Next, the text is converted into hexadecimal format and stored in arrays so that it can be interpreted by the computer.
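As a rough illustration of the data preparation that precedes AES encryption, the sketch below turns a text message into 16-byte blocks arranged as 4×4 state arrays and looks up the round count for a given key length (10, 12, or 14 rounds for 128-, 192-, or 256-bit keys). It performs no encryption. The function names and the choice of PKCS#7 padding are assumptions made for the example and are not taken from the cited paper.

```python
AES_BLOCK_BYTES = 16  # AES always operates on 128-bit blocks

# Round counts fixed by the AES standard for each key length in bits.
ROUNDS_BY_KEY_BITS = {128: 10, 192: 12, 256: 14}

def rounds_for_key(key):
    """Return the number of AES rounds for a key given as bytes."""
    return ROUNDS_BY_KEY_BITS[len(key) * 8]

def pad_pkcs7(data):
    """Pad to a whole number of blocks (PKCS#7 is an assumed padding scheme)."""
    pad_len = AES_BLOCK_BYTES - len(data) % AES_BLOCK_BYTES
    return data + bytes([pad_len]) * pad_len

def to_state_arrays(message):
    """Encode text as bytes and arrange each block as a 4x4 AES state."""
    data = pad_pkcs7(message.encode("utf-8"))
    states = []
    for start in range(0, len(data), AES_BLOCK_BYTES):
        block = data[start:start + AES_BLOCK_BYTES]
        # The AES state is filled column by column: state[row][col] = block[row + 4*col]
        states.append([[block[row + 4 * col] for col in range(4)] for row in range(4)])
    return states

states = to_state_arrays("MOVE NORTH")
for row in states[0]:
    print(" ".join(f"{byte:02x}" for byte in row))  # hexadecimal view of the state
print(rounds_for_key(bytes(16)))  # all-zero 128-bit key, used only for the lookup
```

A real system would pass the padded data to a vetted AES implementation rather than manipulating state arrays directly; the sketch only makes the hexadecimal and array representation concrete.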
From there, the data to be encrypted undergoes four transformations for every round. The number of rounds is determined by the length of the cipher key: 10, 12, and 14 rounds for 128-, 192-, and 256-bit keys respectively. All of these steps work to scramble the data as much as possible, making a user’s data even more difficult to intercept.
In the paper “Full-Parallax Holographic Light-Field 3-D Displays and Interactive 3-D Touch” [7], Masahiro Yamaguchi explains that human–computer visual interfaces are evolving toward more natural and intuitive interactions, e.g., from multitouch to gesture, and from 2-D to 3-D. Combining a 3-D display and a gesture interface allows direct interaction with an image reproduced in 3-D space, which makes interaction easier and more enjoyable. In this system, the reproduced 3-D images and the 3-D touch detection are associated with each other, so there is no need for complicated registration between them. The identification of the user’s interaction is simple, because the color information of the 3-D image can be used for this purpose.
Some experimental results of the 3-D touch-sensing display are introduced, and possible applications of this technology are discussed as well.
In the paper “Holographic 3D Touch Sensing Display” [8], Masahiro Yamaguchi and Ryo Higashida propose a system in which a 3D image floating in the air is reproduced by a 3D light-field display using a holographic screen, and a 3D touch interface is implemented by detecting touches on the reproduced real image. A simple 3D touch sensing experiment based on the proposed method is demonstrated. The holographic screen consists of a 2D array of small elementary holograms that reproduce diverging light.
It is produced by the optical system of a holographic 3D printer without exposing a 3D image. The holographic screen is illuminated by the light from a projector, where the projected image contains light-ray information similar to that used in integral imaging. The holographic screen reconstructs a 2D array of diverging light rays, which are modulated by the projected image, and it works as a full-parallax light-field 3D display.
The system for the 3D touch-sensing display is shown in Fig. 2. A 3D real image is reproduced by the display described above, and a user can touch the floating image. The fingertip that touches the real image is then colored by the image and detected by the camera behind the holographic screen.
If the color of the detected light agrees with the image reproduced by the 3D display, the fingertip is detected. By monitoring the difference between successive frames, the background pattern can be almost entirely removed from the images captured by the camera.
In the paper “Head motion controlled power wheelchair” [9], Kupetz et al. implemented a head movement tracking system using an IR camera and IR LEDs.
It tracks a 2×2 infrared LED array attached to the back of the head. LED motion is processed using light tracking based on a video analysis technique in which each frame is segmented into regions of interest and the movement of key feature points is tracked between frames. The system was used to control a power wheelchair. It needs a power supply for the LEDs, which can be wired or battery-based. The system could be improved to detect as many unique head movements as possible for use in different applications, and it needs more experiments and theoretical analysis to establish its accuracy.
In the paper “Head movement recognition based on LK algorithm and Gentleboost” [10], Jian-zheng and Zheng presented a mathematical approach using image processing techniques to trace head movements.
They suggested that the pattern of the head movements can be determined by