Wednesday, April 24, 2019

Onsite Interactive Training Session for Data Analysts/Data Scientists

Dr. Zhou is offering a brand new service: a half-day (3-hour) interactive training session to help a company's data analysts and data scientists improve their skills and productivity. The format of the session is as follows:

1. Dr. Zhou first gives a 1.5-hour presentation describing successful cases of using data analytics to solve business problems, sharing best practices, and discussing challenges.
2. For the remaining 1.5 hours, the audience asks questions and engages in discussion with Dr. Zhou. This is the best part! As one of the scientists said, "Dr. Zhou has helped me solve an issue that I had struggled with for years".

The training session is designed for all data analysts and data scientists. Dr. Zhou shares battle-tested strategies and best practices that are useful regardless of the specific analytics tools or programming languages they use.

On April 23, 2019, Dr. Zhou delivered the interactive training session to one of the top three property insurance companies located in Boston. It was extremely well received. The attached picture shows Dr. Zhou giving the presentation (in addition to the people in the room, more people joined the meeting by phone).

The following are testimonials from two people who took the course.

"We have learnt a lot from Dr. Zhou" - Gang Xu, Director, Data Science at Lincoln Financial Group
"Dr. Zhou gave us a great overview of procedures of doing a solid predictive analysis and illustrated real life AI consulting business cases. Dr. Zhou is really experienced in the AI space and his presentation was very well received by data scientists from Lincoln Financial Group. I would highly recommend any data science group to have Dr. Zhou sharing his experiences." - Dr. Hao Zhou, Principal Analyst, Data Science at Lincoln Financial Group

The following are testimonials about my previous talks and training activities.

"It was a fortune to have Jay come to our computer science department to share his experience in solving business problems with predictive analytics on February 28, 2017. What Jay had presented in his 3 talks, each lasting for 1 hour in different topics of data mining, was totally impressive and beyond our wildest expectation. Having built competition-winning predictive models for some of the biggest companies and produced hundreds of millions of dollars’ savings, Jay shared the secret of his success with students and faculty without reservation. His strong presentations were such an inspiration for our computer science students and faculty and his methodology was innovative and powerful, even for very seasoned data scientists among the audience. Jay, thank you so much for your hard work preparing and delivering these presentations!" - Dr. Wei Ding, Professor at University of Massachusetts Boston

"Jay is more than just a coder, he is a great trainer, and a good presenter of theoretical data mining concepts so that they can be understood by most." - James Lukenbill, Director of IT Project Management, Optum

Bio of Dr. Jiang Zhou
Dr. Jiang Zhou has two decades of experience building predictive models across industries including telecommunication, banking, insurance, and smart city. These solutions have resulted in over $200 million in savings for clients. Dr. Zhou has taken part in three real-world competitions to build the best predictive model: a customer credit risk model for a top three cell phone company, a bank card fraud detection model for a top 15 bank, and a direct sales model for a marketing company. Dr. Zhou's models won all three competitions. He has founded or co-founded data analytics companies including Business Data Miners, Smart Credit and AI Strike. Previously, he was a chief statistician at Lightbridge, a vice president at Citizens Bank and a consulting member of technical staff at Oracle. Dr. Zhou is the author of an award-winning blog on data analytics.

The normal price for the training service is $6,500. If your company is interested in the service, please contact Dr. Zhou at 

Wednesday, April 17, 2019


Workshop on Data Mining in Industrial Internet of Things (DMIIOT)
to be held in conjunction with the IEEE International Conference on Data Mining 2019 in Beijing
Data generated by the industrial internet of things (IIOT) has been growing at an exponential rate. Data mining plays an essential role in deriving actionable information from this raw data. By applying a variety of data mining technologies to historical and real-time IIOT data, building supervised or unsupervised models, and deploying them into production environments to help businesses make better decisions, significant value can be created in the form of reduced waste, improved efficiency and broadened opportunities. The marriage between data mining and IIOT has found applications in industries such as manufacturing, energy, healthcare, retail, smart city and transportation.
The workshop will provide a venue for researchers and practitioners from both the data mining and IIOT communities to exchange ideas, share best practices, and discuss challenges and future directions. By fostering communication and collaboration, we hope to drive innovative applications of data mining to IIOT. The workshop will be held in conjunction with the 2019 IEEE International Conference on Data Mining in Beijing.
This workshop calls for papers that cover topics including, but not limited to, the following:
  • Data mining algorithms for IIOT
  • Data mining architectures in IIOT environment
  • Data mining applications in areas including manufacturing, energy, healthcare, smart city, transportation, etc.
  • Best practices, challenges and future developments
This workshop calls for research papers that share experiences from real data and real-world practice. We do not require technical innovations (using existing data mining techniques is totally acceptable). All accepted workshop papers will be published in formal proceedings by the IEEE Computer Society Press and indexed by EI.

Paper Submissions:

Deadline: 7 August 2019, 11:59 PM Pacific Time.


ICDM workshops follow the same submission requirements as ICDM papers.
  • Long paper (up to 8 pages) and short paper (up to 4 pages). The page limit includes the bibliography and any possible appendices.
  • All papers must be formatted according to the IEEE Computer Society proceedings manuscript style, following IEEE ICDM 2019 submission guidelines available at:
  • Papers should be submitted in PDF format, electronically, through email to
  • All accepted papers will be included in the IEEE ICDM 2019 Workshops Proceedings volume published by IEEE Computer Society Press and will also be included in the IEEE Xplore Digital Library. Therefore, papers must not have been accepted for publication elsewhere or be under review for another workshop, conference or journal.

Paper Submission Deadline: 7 August 2019, 11:59 PM Pacific Time.
Paper Notification: 4 September 2019
Camera Ready Version: 8 September 2019
Workshop: 8 November 2019
Steering Committee
  • Dr. Jay Lee (Univ. of Cincinnati)
  • Dr. Qi Li (Peking University) 
  • Dr. Shaofu Lin (Beijing University of Technology)
  • Dr. Shyam Varan Nath (Oracle)
  • Dr. Richard Mark Soley (Industrial Internet Consortium)
  • Dr. Honggang Wang (University of Massachusetts-Dartmouth)
  • Dr. Jiansheng Zhang (Tsinghua University)

Program Committee
  • Dr. Zhaoheng Gong (Harvard University)
  • Dr. Jay Lee (Univ. of Cincinnati)
  • Dr. Shaofu Lin (Beijing University of Technology)
  • Mr. Song Luo (China Academy of Information and Communication Technology)
  • Dr. Shyam Varan Nath (Oracle)
  • Dr. Honggang Wang (University of Massachusetts-Dartmouth)
  • Dr. Jiansheng Zhang (Tsinghua University)

Workshop Chairs
  • Dr. Ping Chen (University of Massachusetts-Boston) 
  • Dr. Jiang (Jay) Zhou (AI Strike LLC)

Thursday, March 07, 2019

From Hype to Reality – Powering the AI-Driven Future of Insurance at Insurance AI and Analytics USA - by Ira Sopic

With 2018 witnessing unprecedented advances in the investment and deployment of artificial intelligence within the insurance industry, Insurance Nexus is delighted to announce that the Insurance AI and Analytics USA Summit will return to Chicago for a sixth time in 2019, welcoming more than 450 senior attendees to the Renaissance Chicago Downtown Hotel, May 2-3.
Featuring an agenda designed to tackle the biggest challenges and opportunities in AI and advanced analytics, Insurance AI and Analytics USA is a must-attend for any analytics, underwriting, claims or marketing innovators seeking to both achieve efficient and seamless operations and deliver valuable and relevant products and experiences. 

“It’s impossible to open a magazine without seeing hype about analytics changing every aspect of your life,” says Will Dubyak, VP Analytics for Product Development & Innovation, USAA. “The Insurance AI & Analytics USA Summit is the optimal place to cut through the noise, hear the latest thinking from industry leaders in analytics, and compare best practices with your colleagues.”
Across three in-depth tracks, more than 40 expert speakers from leading North American carriers will explore and discuss the latest strategies and approaches being deployed to maximize the impact of AI, machine learning and advanced analytics across the insurance value chain.
Featuring a whole session dedicated to case studies, the practical retelling of success stories will ensure attendees discover how, and where, technological innovations are having the biggest impacts on insurance and walk away with a holistic roadmap for success.
Confirmed speakers so far include Tilia Tanner, Global Head of Analytics, AIG, Eugene Wen, VP of Group Advanced Analytics, Manulife and Jerry Gupta, SVP, Digital Analyst Catalyst, SwissRe, as well as:
  • Thomas Sheffield, SVP and Head of Specialty Claims, QBE
  • Glenn Fung, Chief Research Scientist, AI and Machine Learning Research Director, American Family Insurance
  • Laurie Pierman, Vice President, Claim Operations, Amerisure Insurance
  • Michiko Kurahashi, Chief Marketing Officer, AXIS Capital

Attendees to Insurance AI and Analytics USA will also become part of a truly international insurance community, with over 25 hours of networking and interactive discussions aplenty. In addition, our ‘Open Design Workshops’ will see attendees attempt to live-solve industry challenges, giving insight into how peers and competitors alike approach a challenge, and how their own methods might be improved.

“At QBE, we’ve spent a great deal of time figuring out how we can strategically deploy artificial intelligence in practical use cases to drive immediate value for business,” states Ted Stuckey, Managing Director, QBE Ventures. “We’re excited to share some of our experience at the Insurance AI and Analytics Conference in Chicago on May 2-3!”

In short, however you are seeking to leverage AI, Insurance AI and Analytics USA is the event for you. Don’t miss this unparalleled opportunity. Join us in making 2019 the year AI changes insurance forever.

Ira Sopic

Tuesday, January 15, 2019

About Dr. Zhou's Oracle SQL for Data Science Course

On January 31, 2017, I was invited by Prof. Wei Ding of the Department of Computer Science, University of Massachusetts Boston, to give 3 talks about my data science projects across different industries. These talks were extremely well received. The following is what Prof. Ding said about my talks.

"It was a fortune to have Jay come to our computer science department to share his experience in solving business problems with predictive analytics on February 28, 2017. What Jay had presented in his 3 talks, each lasting for 1 hour in different topics of data mining, was totally impressive and beyond our wildest expectation. Having built competition-winning predictive models for some of the biggest companies and produced hundreds of millions of dollars’ savings, Jay shared the secret of his success with students and faculty without reservation. His strong presentations were such an inspiration for our computer science students and faculty and his methodology was innovative and powerful, even for very seasoned data scientists among the audience. Jay, thank you so much for your hard work preparing and delivering these presentations!" -Prof. Ding Wei, Department of Computer Science, University of Massachusetts Boston

The audience was particularly amazed by how I came up with solutions using the Oracle SQL environment. To share my expertise, I created the online course Oracle SQL for Data Science to show how to perform common data science tasks using Oracle SQL and the benefits of doing so.

I let Charlie Berger, Senior Director of Product Management, Machine Learning, AI and Cognitive Analytics at Oracle, know about my course, and he told me, "Your course is amazing."

Deep Learning World

Deep Learning World is the premier conference covering the commercial deployment of deep learning. The event’s mission is to foster breakthroughs in the value-driven operationalization of established deep learning methods. DLW runs parallel to the established Predictive Analytics World for Industry 4.0 at the same venue. Combo passes are available.

How to turn Deep Tech into Broad Application

The hype is over: deep learning has entered the “trough of disillusionment”. Companies are realizing that not every business problem requires the deep learning hammer. Of course, some use cases are best solved with artificial neural networks: image, speech and text recognition; anomaly detection and predictive maintenance on sensor data; complex data synthesis and sampling; reinforcement and sparse learning; and many more applications that show the potential of artificial intelligence for real-world business scenarios. At the Deep Learning World conference, data science experts present projects that went beyond experimentation and prototyping and showcase solutions that created economic value for their companies. The case study sessions will focus on what worked and what didn’t, while the deep dive sessions will explain topics such as RNN, CNN, LSTM, transfer learning and more in analytical and technical detail. Meet the European deep learning community in May in Munich and learn from well-known industry leaders! Blog readers receive a 15% discount with code: DDMPAWDLW

Predictive Analytics World for Industry 4.0

6-7 May, 2019 – Munich
Predictive Analytics World is the leading vendor-independent conference for applied machine learning for Industry 4.0.
Business users, decision makers and experts in predictive analytics will meet on 6-7 May 2019 in Munich to discover and discuss the latest trends and technologies in machine & deep learning for the era of Internet of Things and artificial intelligence.

Putting Machine Intelligence into Production

Smart Factory, Smart Supply Chain, Smart Grid, Smart Transport: artificial intelligence promises an intelligent and fully automated future, but the reality is that most machines, vehicles and grids lack sensors, and even where sensors do exist they might not be connected to the Internet of Things. Many companies have invested in their infrastructure and are experimenting with prototypes, e.g. for predictive maintenance, dynamic replenishment and route optimization. But even when they succeed in delivering a proof of concept, they face the challenge of deploying their predictive model into production and scaling their analytics solution to company-wide adoption. The issues are not merely analytical but a combination of technical, organisational, legal and economic details. At Predictive Analytics World for Industry 4.0, experienced data scientists and business decision makers from a wide variety of industries will meet for two days to demonstrate and discuss dozens of real-world case studies from well-known industry leaders. In addition, predictive analytics experts will explore new methods and tools in detail in special deep dive sessions. Finally, Predictive Analytics World is accompanied by the Deep Learning World conference, which focuses on the industry and business application of neural networks. Take the chance, learn from the experts and meet your industry peers in Munich in May!
Hot topics on the 2019 Agenda:
  • Predictive Maintenance & Logistics
  • Anomaly Detection & Root Cause Analysis
  • Fault Prediction & Failure Detection
  • Risk Management & Prevention
  • Route & Stock Optimization
  • Industry & Supply Chain Analytics
  • Image & Video Recognition
  • Internet of Things & Smart Devices
  • Stream Mining & Edge Analytics
  • Machine Learning, Ensemble Learning & Deep Learning
  • Process Mining & Network Analyses
  • Mining Open & Earth Observation Data
  • Edge Analytics & Federated Learning
… and many more related topics

Predictive Analytics World for Industry 4.0 will be co-located with Deep Learning World, the premier conference covering the commercial deployment of deep learning in 2019. Blog readers receive a 15% discount with code: DDMPAWDLW

Tuesday, January 08, 2019

Analytics & AI in Travel North America

Analytics & AI in Travel North America, launched by EyeForTravel, will take place on March 14-15 at the Hilton Parc 55 Hotel, San Francisco, USA. With over 350 senior data, analytics, pricing, product development and digital marketing experts from the world’s leading travel companies, the event will explore strategies for brands to address the biggest opportunity right now: how to conquer hyper-personalization.
Confirmed speakers include Hilton’s SVP of Analytics, Google’s head of AI global product partnerships, Expedia’s Head of Platform – Loyalty, Wyndham Hotel Group’s Vice President of Global Revenue Management Operations and Sales, Carlson Wagonlit’s Principal Data Scientist, and many more.

Attendees can expect to explore insights into the following:

• Harnessing AI and Data to Transform your Loyalty Strategy: Discover how weaving AI into your business, capturing preference data and delivering a truly personalized service will give you the edge in winning loyal customers from your competition

• Overcoming Pricing Peril with Personalized & Real-Time Revenue Generation Tactics: Make the shift to real-time pricing on an individual level, nail down the use cases for overcoming pricing challenges, forecast like a pro and optimize direct revenue

• Getting Up Close and Personal with the Customer and Capitalizing on Every Channel: Use AI to fuel CRM and CS to bring customer data to life at every touchpoint, use the rich and famous on social to avoid brand erosion and secure market share.

• Immersing Yourself in an AI-driven Predictive Future to Seize New Profits: It all comes down to being predictive if you want to turn new profits. Deliver AI-led futures in your company for more efficient internal mechanics and travel customer-centricity

• Driving Real-Time, Hyper-Personalization to Move Your Profit Needle: Delve into new levels of granularity, become the Amazon of travel and deliver the perfect travel itinerary every time for unstoppable loyalty

• Seizing Voice, AR and VR to Grab that Conversion: Be part of the lucky few that benefit from voice-enabled search, drive direct bookings and use AR and VR to give your customer the confidence to convert

• Dominating Direct Bookings Through A Mastery of Mobile: Create an AI-enabled mobile product that drives direct bookings, focus on UI and UX that screams out loyalty and bolster your bottom line

• Outclassing your Competition with Total RM and Surge Ancillary Sales: Build state-of-the-art infrastructure that supports ancillary revenue and squeeze every ounce of profit from all revenue streams


Thursday, December 20, 2018

Could Not Connect to Amazon RDS Oracle Database From Car Dealer WiFi

I was trying to connect to my Oracle database on Amazon RDS from a car dealer while my car was being serviced. My laptop was connected to the dealer's public WiFi. When I tried to connect to the Oracle server, I got "Error Message = IO Error: The Network Adapter could not establish the connection".

I realized the issue was that my new IP address was not included in the inbound rules of the Amazon security group. I looked up my IP address, logged on to the AWS console and found the security group associated with the DB instance. After I added an inbound rule for that address, I was able to connect to the DB immediately.
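
The same fix can also be scripted instead of done in the console. The following is only a minimal sketch, not what I ran that day: it assumes the database listens on Oracle's default port 1521, that the boto3 library and AWS credentials are available, and the function names and the rule description are made up for illustration.

```python
import ipaddress


def build_oracle_ingress_rule(public_ip, port=1521):
    """Return an IpPermissions list that lets one IPv4 address reach the DB port."""
    # ip_address raises ValueError if the string is not a valid IP address.
    addr = ipaddress.ip_address(public_ip)
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    return [{
        "IpProtocol": "tcp",
        "FromPort": port,   # assumption: Oracle default listener port
        "ToPort": port,
        # /32 restricts the rule to this single address
        "IpRanges": [{"CidrIp": f"{addr}/32",
                      "Description": "temporary: public WiFi"}],
    }]


def add_inbound_rule(group_id, public_ip):
    """Apply the rule with boto3 (needs AWS credentials; hypothetical helper)."""
    import boto3  # imported lazily so the helper above works without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=build_oracle_ingress_rule(public_ip),
    )
```

Calling add_inbound_rule with the security group ID of the DB instance and the current public IP address would add the rule. Since a public WiFi address is temporary, it makes sense to delete the rule once the work is done.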

Wednesday, December 19, 2018

Statistically Manufactured Personal Data

To avoid the trouble of dealing with real personal data when testing analytics processes, I have created mock personal data that closely reflects the American population from a statistical point of view. The largest data set has 1 million records with variables including first name, last name, sex, date of birth, social security number, address, phone number and email. The values of these variables are generated to be as realistic as possible. The 1 million records represent about 0.3% of the population of the United States.

The following observations about the 1 million mock personal data records are very close to the real statistics of the US population.

1. The top 4 states by population are California (138223 persons, 13.82%), Texas (99217 persons, 9.92%), Florida (69640 persons, 6.96%) and New York (49979 persons, 5.00%).
2. Females make up 51% of the records and males 49%.
3. The top 3 last names are Smith (10800 persons, 1.08%), Williams (8000 persons, 0.80%) and Jones (6900 persons, 0.69%).
4. The top 3 female first names are Ava (4707 persons, 0.93% of females), Olivia (4508 persons, 0.89%) and Isabella (4311 persons, 0.85%), and the top 3 male first names are Noah (5075 persons, 1.03% of males), Elijah (4736 persons, 0.96%) and Liam (4434 persons, 0.90%).
5. The following table shows the distribution of persons by age group for each sex. In the oldest age groups women outnumber men, consistent with women living longer than men.
Age Group            Female #  Female %    Male #    Male %
Under 5 years           34603     6.81%     35656     7.25%
5 to 9 years            34707     6.83%     34010     6.92%
10 to 14 years          30192     5.94%     33013     6.72%
15 to 19 years          34361     6.76%     32689     6.65%
20 to 24 years          32512     6.39%     36647     7.45%
25 to 29 years          35626     7.01%     37278     7.58%
30 to 34 years          34344     6.76%     31977     6.50%
35 to 39 years          33325     6.55%     31927     6.49%
40 to 44 years          33332     6.56%     34456     7.01%
45 to 49 years          35070     6.90%     35443     7.21%
50 to 54 years          37321     7.34%     34876     7.09%
55 to 59 years          31623     6.22%     31315     6.37%
60 to 64 years          28801     5.67%     24218     4.93%
65 to 69 years          20999     4.13%     19881     4.04%
70 to 74 years          16617     3.27%     14065     2.86%
75 to 79 years          13520     2.66%     10272     2.09%
80 to 84 years          10693     2.10%      7983     1.62%
85 years and over       10754     2.12%      5894     1.20%
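
As a quick consistency check, the percentages in the table can be recomputed from the counts. The short sketch below copies the counts from the table (the variable and function names are only illustrative); each percentage is the count divided by the total for that sex, and the two totals together add up to the 1 million records.

```python
# Counts copied from the age table: (age group, female count, male count)
AGE_COUNTS = [
    ("Under 5 years", 34603, 35656), ("5 to 9 years", 34707, 34010),
    ("10 to 14 years", 30192, 33013), ("15 to 19 years", 34361, 32689),
    ("20 to 24 years", 32512, 36647), ("25 to 29 years", 35626, 37278),
    ("30 to 34 years", 34344, 31977), ("35 to 39 years", 33325, 31927),
    ("40 to 44 years", 33332, 34456), ("45 to 49 years", 35070, 35443),
    ("50 to 54 years", 37321, 34876), ("55 to 59 years", 31623, 31315),
    ("60 to 64 years", 28801, 24218), ("65 to 69 years", 20999, 19881),
    ("70 to 74 years", 16617, 14065), ("75 to 79 years", 13520, 10272),
    ("80 to 84 years", 10693, 7983), ("85 years and over", 10754, 5894),
]

female_total = sum(f for _, f, _ in AGE_COUNTS)
male_total = sum(m for _, _, m in AGE_COUNTS)


def pct(count, total):
    """Share of total as a percentage, rounded to two decimals."""
    return round(100 * count / total, 2)


# Print each age group with its share within the same sex.
for group, f, m in AGE_COUNTS:
    print(f"{group:<20}{f:>9}{pct(f, female_total):>9.2f}%"
          f"{m:>10}{pct(m, male_total):>9.2f}%")

# Overall split between the sexes across all records.
all_records = female_total + male_total
print("female share:", pct(female_total, all_records), "%")
print("male share:  ", pct(male_total, all_records), "%")
```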
You may download a small file with 100 records for free. Files with 5K, 50K, 250K and 1 million records are available for purchase at
File Name                 Description                                                           Price
dm_mock_person_100.csv    100 mock personal data records. CSV format.                           Free
dm_mock_person_5k.csv     5K mock personal data records. About 0.7M bytes. CSV format.          $2.95
dm_mock_person_50k.csv    50K mock personal data records. About 7M bytes. CSV format.           $7.95
dm_mock_person_250k.csv   250K mock personal data records. About 35M bytes. CSV format.         $9.95
dm_mock_person_1m.csv     1 million mock personal data records. About 140M bytes. CSV format.   $39.95
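
Once a file is downloaded, distributions such as the share of records by state or by sex can be computed in a few lines. The sketch below is only an illustration: the column names (first_name, last_name, sex, state) and the four sample rows are assumptions made up for the example, so check the header of the actual CSV file and adjust them.

```python
import csv
import io
from collections import Counter


def share_by(rows, column):
    """Return {value: percentage of rows}, most frequent value first."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {value: round(100 * n / total, 2)
            for value, n in counts.most_common()}


# Tiny made-up sample; hypothetical column names, not the real file layout.
SAMPLE = """first_name,last_name,sex,state
Ava,Smith,F,CA
Noah,Jones,M,TX
Olivia,Williams,F,CA
Liam,Smith,M,FL
"""

# For a real file, replace io.StringIO(SAMPLE) with
# open("dm_mock_person_100.csv", newline="").
rows = list(csv.DictReader(io.StringIO(SAMPLE)))

state_share = share_by(rows, "state")
sex_share = share_by(rows, "sex")
print(state_share)
print(sex_share)
```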