Kaizen Voice Authentication is built on open standards, using Java and Python. It works with most popular databases, such as DB2, MS SQL, and Oracle.
Of course, system results vary with noise and network disturbances. Too much noise is the only enemy. Users must be in a less noisy environment to allow successful enrollment and authentication of a human voice. We recommend using other factors in addition to voiceprints in a noisy work environment.
These are very well established biometric options. They offer good results, but they need specific readers/scanners at all access points. Most of the access equipment wears out in a few months, which reduces the effectiveness score. Dust, moisture, wear and tear, and low maintenance all affect the results of a scan. Voice biometrics works through any mobile phone or landline and does not need any specific equipment.
It is highly aware of “liveness” [an actual human participating in the authentication], less intrusive, inexpensive, and performs well on EER [equal error rate]. Go for it.
The usual customer authentication by a live help desk agent needs nearly 90 seconds. The first 40 secs are spent choosing the IVRS options or problem description, later 40-50 secs on quoting the relevant PIN/date of birth/answers to secret questions.
This is the case when a customer calls into a contact center and seeks help through a contact center executive. If we deploy Voice Authentication instead, the authentication is completed in 10 secs, which allows for faster resolution. This increases customer satisfaction rates and reduces AHT (Average Handling Time) and contact center billing costs.
- Active authentication is where the customer is aware of the process, voluntarily participates in the authentication and provides time for the same. This is before he/she is connected to a live agent/contact center axis.
- Passive authentication is where the customer has contacted the service help desk, is describing his/her problem and the biometric server runs an authentication check in the background, as he speaks. The help desk may encourage the customer to speak for more than 10 secs, to allow for correct authentication/recognition against voiceprints. This passive system needs more ports, E1/PRI lines on the CTI to allow for real-time processing. This is resource hungry.
National Institute of Standards and Technology (NIST) Special Publication 800-63-2 discusses various forms of two-factor authentication and gives guidance on using them in business processes requiring different levels of assurance.
We adhere to all the norms and comply with NIST standards. Customers can seek 3rd party validation of the same.
KSV has published web services for integration touch-points with other 3rd party apps like CRM, IVRS and ERP. During the deployment, voice-stream [min 9 Sec string] of users is routed from the IVRS port to the cache of CTI, later to the biometric server. This is an established best practice. Media gateway through the SIP protocol manages all other streams [VOIP] and sends them to the voice biometric server.
Do you want to deliver the project fully? Yes, that can be organized. Partners can focus on selling the product initially in a chosen territory. If they can invest in resources, the KSV team will train them on best practices and involve them in one project implementation, working alongside the KSV implementation team. Learn, get certified, and become independent on implementation.
Gaussian mixture models are a probabilistic model for representing normally distributed subpopulations within an overall population. Mixture models in general don’t require knowing which subpopulation a data point belongs to, allowing the model to learn the subpopulations automatically. Since subpopulation assignment is not known, this constitutes a form of unsupervised learning.
For example, in modeling human height data, height is typically modeled as a normal distribution for each gender with a mean of approximately 5’10” for males and 5’5” for females. Given only the height data and not the gender assignments for each data point, the distribution of all heights would follow the sum of two scaled (different variance) and shifted (different mean) normal distributions. A model making this assumption is an example of a Gaussian mixture model (GMM), though in general, a GMM may have more than two components. Estimating the parameters of the individual normal distribution components is a canonical problem in modeling data with GMMs.
GMMs have been used for feature extraction from speech data, and have also been used extensively in object tracking of multiple objects, where the number of mixture components and their means predict object locations at each frame in a video sequence.
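The height example above can be sketched in a few lines of code. This is a minimal illustration of fitting a two-component GMM with expectation-maximization (EM), not the algorithm used inside the product. The function names, the synthetic data, and the 2.5 inch standard deviation are assumptions made for the example; only the two means (65 and 70 inches) come from the text.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(data, iterations=100):
    """Fit a 1-D, two-component GMM with expectation-maximization (EM)."""
    ordered = sorted(data)
    n = len(ordered)
    # Start the two means at the lower and upper quartiles of the data.
    means = [ordered[n // 4], ordered[(3 * n) // 4]]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])
        # M-step: re-estimate the weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances


random.seed(0)
# Synthetic heights in inches. The means come from the text (5'5" = 65, 5'10" = 70);
# the standard deviation of 2.5 inches is an assumption for illustration only.
heights = [random.gauss(65.0, 2.5) for _ in range(1000)]
heights += [random.gauss(70.0, 2.5) for _ in range(1000)]

# No gender labels are passed in: the model learns the two subpopulations itself.
weights, means, variances = fit_two_component_gmm(heights)
print("weights:", weights)
print("means:", means)
print("variances:", variances)
```

Because the two height distributions overlap heavily, the recovered means are only approximate. Production systems typically use a tested library implementation rather than a hand-written loop like this one.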
Relatively stable characteristics:
- Vocal tract length
- Vocal tract shape
- Vocal cord length (pitch)
- Gender (Breathiness)
- Nasal cavity size and shape
- Speaking rate and prosody
- Language, dialect, and accent
Transient characteristics:
- Emotional state
Done in a jiffy. KSV has published APIs for common web services and data call out from other apps. Partners/ customer IT teams can do it themselves or our delivery team can do it for you. Easily available.
You mean the customer wants other biometrics? No worries, we can do that. Fingerprinting and facial recognition are two more options that can be provided directly by KSV. We can configure other 3rd party solutions if needed to complement Voice Authentication and provide a comprehensive solution to end customers.
Yes, it is platform agnostic. It will work on all server operating systems.
Geo-tagging is an app that is installed on the user’s mobile phone. This picks up the latitude-longitude with the time-stamp, while the voice authentication is done. Instead of a fixed line, this app can serve as the proof of fixed latitude-longitude coordinates, via a mobile phone-based voice authentication.
- The product license fee is based on the number of customer/user blocks.
- One-time setup cost and implementation costs.
- AMC for product license at 15%
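As a rough illustration of how the three pricing components above combine, here is a small calculation. The block size, the fee per block, and the setup cost are placeholder figures invented for the example, not KSV prices; only the 15% AMC rate comes from the list above.

```python
import math


def license_costs(users, block_size, fee_per_block, setup_cost, amc_percent=15):
    """Break a quote into license fee, one-time setup cost, and yearly AMC.

    All inputs except amc_percent are hypothetical figures supplied by the caller.
    """
    blocks = math.ceil(users / block_size)  # users are licensed in whole blocks
    license_fee = blocks * fee_per_block
    yearly_amc = license_fee * amc_percent / 100  # AMC is charged on the license fee
    return {
        "blocks": blocks,
        "license_fee": license_fee,
        "setup_cost": setup_cost,
        "yearly_amc": yearly_amc,
    }


# Placeholder numbers: 2,500 users, blocks of 1,000 users, 10,000 per block.
quote = license_costs(users=2500, block_size=1000, fee_per_block=10000, setup_cost=5000)
print(quote)
```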
Yes, we are seeking a Microsoft technology center in Bangalore as one such POC testbed. Customers can walk-in for an immersion experience of the product and surround technologies like database, datacenter automation, IVRS, networking, etc.
This is being made available. Test reports can be shared with registered partners and customers under NDA.
Yes, this will be made available through Azure, AWS, and other service providers. Hosted/dedicated instance, cloud or SaaS options available. Customers can pay USD/INR per user per month for a complete product suite.
A plain vanilla installation and configuration of the voice biometrics application will take less than 3 weeks. Integration with other applications/CTI/IVRS will vary and will be based on our effort estimation. Web services based APIs can be used to shorten this integration.
Yes, you can do that. This is independent of language and text.
A noisy work environment will lead to a high rate of false rejection. Customers must be in a less noisy environment and use the mobile phone in normal mode, not in loudspeaker mode. Authentication depends on clean reception of the voice by the biometric engine.
No, the database captures and retains only the parameters of a voice in the voiceprint. No recorded voice or audio file is stored in the biometrics engine, so there is nothing to hack, and no one can copy or re-create an audio file.
No, they can’t. The human voice has 12 unique parameters like pitch, tone, breathlessness, concatenation, etc. Mimicry may match 2-3 parameters of another person’s voice. Hence it is rejected during authentication. Prerecorded voice will have different waveforms and is easily identified as a false attempt, leading to rejection.
First Time Enrollment:
This is a one-time exercise for any user/customer and will not take more than 30 seconds of speaking into the phone. They can speak in any language and use any text. This enrollment creates a customer voiceprint.
Every time an enrolled customer calls into the specified IVRS/ contact center number, the biometrics engine needs only 7 seconds to authenticate against a database of voiceprints.
Savings on call center costs, bandwidth, manpower costs, and cost of a data breach. GDPR guidelines or banking regulatory authority guidelines can be better met through multi-factor authentication.
Retain high ARPU customers, save on contact center costs, ensure low ARPU clients use only IVRS option, and not cut into call center time / AHT, GDPR compliance.
Avoid invalidated payouts to pensioners, facilitate better compliance to proof of life process, save time and costs of enrollment of the new batch of retiring personnel, seek monthly proof of life at no extra cost instead of annual proof.
Not for Profit /NGO:
Control fund flow from trust/donor to field implementation agencies, track beneficiaries every month, track them as alumni and map them against expected impact assessment, ensure money spent on social dev projects is validated in real-time.
Real-time attendance mapping of field staff with Geo-tagging, ensure scheduled shifts don’t slip up, the customer’s customer is happy and retains the contracted business, save manpower costs in manual monitoring of staff, cut costs and align field attendance with a timestamp to actual salary calculations.
Yes, it can be used as a privilege or differentiator for banks or other organizations. Allow high net worth account holders or high-value customers to use this tool.
The system can be configured for the desired number of attempts. For example, a valid user who gets rejected can be allowed 2 more attempts, or be redirected to OTP generation or to a customer service desk. It can be configured to give 3-5 attempts for re-authentication.
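A retry policy of this kind could be sketched as follows. This is an illustration only, not KSV's implementation: the function name, the score values, the threshold, and the fallback labels are all hypothetical.

```python
def authenticate_with_retries(attempt_scores, threshold, max_attempts=3, fallback="otp"):
    """Accept on the first attempt whose score meets the threshold.

    attempt_scores: match scores of successive attempts (hypothetical values).
    fallback: where to route the caller once the attempts run out,
              e.g. "otp" or "service_desk" (labels invented for this sketch).
    Returns a (decision, attempts_used) tuple.
    """
    attempts_used = 0
    for score in attempt_scores[:max_attempts]:
        attempts_used += 1
        if score >= threshold:
            return ("accepted", attempts_used)
    return ("fallback:" + fallback, attempts_used)


# First attempt fails, second one passes.
print(authenticate_with_retries([0.55, 0.82], threshold=0.7))
# All three allowed attempts fail, so the caller is routed to the service desk.
print(authenticate_with_retries([0.2, 0.3, 0.4, 0.9], threshold=0.7, fallback="service_desk"))
```

In the second call the fourth score is never evaluated, because the configured limit of 3 attempts has already been reached.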
Yes, it is scientifically possible. Our product roadmap has a high vector algorithm in the beta test stage now. The acceptance rate is closer to zero.
A cold or nasal congestion may affect 2 of the 12 distinguishing parameters of the human voice. The scoring system can still allow the user to log in on that specific day, even if those 2 parameters do not match. This tolerance is configured by the implementation team and can be set high or low, based on customer preferences. No initial investment in scanners/readers is needed, as the system is voice/telephone-based.
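The tolerance described above can be pictured as a simple count of mismatched parameters. The sketch below is a deliberate simplification with invented names; the real scoring system is not documented here and is unlikely to reduce each parameter to a plain match/no-match flag.

```python
def accept_voice(parameter_matches, max_mismatches=2):
    """Accept when no more than max_mismatches parameters fail to match.

    parameter_matches: one boolean per voice parameter (12 in the text).
    max_mismatches: configurable tolerance (hypothetical setting name).
    """
    mismatches = sum(1 for matched in parameter_matches if not matched)
    return mismatches <= max_mismatches


# A user with a cold: 10 of 12 parameters still match.
user_with_cold = [True] * 10 + [False] * 2
# A mimic who reproduces only 3 of 12 parameters.
mimic = [True] * 3 + [False] * 9

print(accept_voice(user_with_cold))  # True
print(accept_voice(mimic))           # False
```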
Each customer has to choose the FAR/FRR ratio that suits their risk profile. FAR cannot be zero; we are testing a high vector algorithm that targets a 0.01% failure rate. Every customer needs to understand how the Voice Authentication application fits their industry needs.
The Equal Error Rate (EER) is the percentage of validation attempts in a biometric system that are falsely accepted or falsely rejected at the operating point where the false acceptance rate (FAR) and the false rejection rate (FRR) are equal. FAR is the rate at which unrecognized/unwanted users are allowed to log in; FRR is the rate at which previously validated users are rejected, preventing valid customers from entering the system. Lowering one rate raises the other, so customers can choose to tune the trade-off in favor of either one. The KSV team will configure the system to suit customer needs and also monitor this.
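To make the definitions concrete, the sketch below computes FAR and FRR at a given threshold and sweeps thresholds to find the point where the two rates are closest. The score lists are made-up numbers for illustration, and the function names are not part of any KSV API.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when a score >= threshold counts as an accept."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return false_accepts / len(impostor_scores), false_rejects / len(genuine_scores)


def equal_error_rate(genuine_scores, impostor_scores):
    """Sweep candidate thresholds; return (threshold, EER) where FAR and FRR are closest."""
    best = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        far, frr = error_rates(genuine_scores, impostor_scores, threshold)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, threshold, (far + frr) / 2.0)
    return best[1], best[2]


# Made-up match scores: higher means "more likely the enrolled speaker".
genuine = [0.9, 0.8, 0.7, 0.6, 0.4]     # attempts by the real user
impostor = [0.1, 0.2, 0.3, 0.5, 0.65]   # attempts by someone else

threshold, eer = equal_error_rate(genuine, impostor)
print("threshold:", threshold, "EER:", eer)

# Raising the threshold trades fewer false accepts for more false rejects.
print(error_rates(genuine, impostor, 0.5))
print(error_rates(genuine, impostor, 0.65))
```

A customer who fears fraud more than inconvenience would run above the EER threshold (lower FAR, higher FRR); one who prioritizes convenience would run below it.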
KSV is a new organization with unique IPRs. KSV Secure has references at SRI (USA) and Lebara (UK). We are striving towards achieving top references in government, banking, defense, NGO trusts, and other sectors. We will share names as and when pilot implementations start.