Large-scale machine learning for financial recommender systems

Status: Awarded

Business environment: BBVA Data & Analytics

Academic environment: Universitat de Barcelona

Municipality: Barcelona

Areas: PE6 Computer Science and Informatics - PE7 Systems and Communication Engineering

Required degree:

Project description

Deep learning, the study of neural networks with multiple layers and non-linearities, has seen an upsurge in recent years thanks to new findings and state-of-the-art performance in fields like computer vision [1], machine translation [2], speech processing [3] and complex game playing [4].

Although the algorithmic aspects have advanced rapidly in recent years, research has mostly concentrated on data that can be easily transformed into matrices or sequences (such as image matrices, real-valued series or sequences of discrete tokens) and on perception tasks in artificial intelligence (vision, reading, listening).

While this is boosting technologies such as self-driving cars and automatic chatbots, it represents only a very small fraction of the industries that currently deal with massive datasets and need artificial intelligence breakthroughs. For example, less effort has been devoted to (a) applications of machine learning where the data is more complex, such as multi-relational, multi-attribute, spatio-temporal data from cities, telecom providers, e-commerce or financial institutions; and (b) AI applications that require reasoning beyond perception, such as recommender systems, forecasting and complex decision-making. “Solving” deep learning for such cases could broaden the impact of AI on society.

We propose to build on recent advances in deep learning to study and develop techniques that go beyond perception and use complex data. As a use case of complex reasoning, we will work on generating financial advice for an individual given a history of financial activities provided by that individual. We have massive datasets of financial activities that we believe are unprecedented in the machine learning literature.

The grand goal of generating financial advice would require modeling massive relational datasets that contain multiple entities of different types (e.g. customers, payments, shops, accounts), with possibly many-to-many relations between them, where the entities or relations can evolve over time, be described by multi-dimensional attributes and be spatially bounded. We will start by investigating the non-trivial question of how to build deep learning models for:
1. Forecasting expenses and events: Given a history of individual expenses, their relations, and the expenses of other users, accurately forecast the timing and impact of expenses or financial events that can be anticipated. Here, we would start by building on deep models such as long short-term memory networks, which obtain state-of-the-art results on sequential and temporal tasks (e.g. [2]). We will then extend these to forecasting models that take into account the “network” of users or take advantage of multiple correlated signals simultaneously.
2. Uncertainty modeling: Deep learning systems for regression or forecasting provide point estimates but not uncertainty intervals or scores, and thus have no mechanism to assess the uncertainty of a prediction. We will study this problem, propose mechanisms that yield uncertainty scores, and compare them to a recent baseline [5]. We will also study the interplay between these models and other machine learning models that deal with uncertainty in a principled way, such as Gaussian processes.
3. Data synthesis: While there have been advances in data synthesis using deep learning, with approaches like PixelCNN, WaveNet or Generative Adversarial Networks (see e.g. [6]), these address signals such as images or speech which, although challenging, are relatively “uniform” (a matrix or time series). Generating data that is relational, contextual to human behavior and external factors, and subject to long-term dependencies would be more challenging. One application of data generation could be synthesizing personalized data for what-if analysis, to aid complex decision-making or to assess long-term financial health and provide recommendations.
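As an illustration of the uncertainty problem in item 2, the following is a minimal sketch of the dropout-based idea behind the baseline [5]: dropout is kept active at prediction time, several stochastic forward passes are run, and their spread is read as a rough uncertainty score. It is not the project's method. The tiny network, its weights (W1, B1, W2, B2), the function names and the input value are all hypothetical and chosen only to keep the example self-contained.

```python
import math
import random
import statistics

# Hypothetical fixed weights for a tiny network: 1 input, 4 hidden units, 1 output.
# In a real system these would come from training with dropout enabled.
W1 = [0.8, -0.5, 0.3, 1.1]
B1 = [0.0, 0.2, -0.1, 0.05]
W2 = [0.6, -0.4, 0.9, 0.2]
B2 = 0.1


def forward(x, drop_prob, rng):
    """One stochastic forward pass with dropout kept active on the hidden layer."""
    keep = 1.0 - drop_prob
    out = B2
    for w1, b1, w2 in zip(W1, B1, W2):
        h = math.tanh(w1 * x + b1)
        # Inverted dropout: keep the unit with probability `keep` and rescale it,
        # otherwise drop it (it contributes nothing to the output).
        if rng.random() < keep:
            out += w2 * h / keep
    return out


def mc_dropout_predict(x, n_samples=200, drop_prob=0.2, seed=0):
    """Return (mean, standard deviation) over n_samples stochastic passes."""
    rng = random.Random(seed)
    samples = [forward(x, drop_prob, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples)


mean, std = mc_dropout_predict(1.5)
print(f"forecast = {mean:.3f} +/- {std:.3f}")
```

With drop_prob set to 0.0 every pass is identical and the spread collapses to zero, which recovers the plain point estimate that item 2 describes as insufficient.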
We also note that the conclusions extracted from this research could be extended to other areas with the same degree of data complexity, such as transport networks or e-commerce.
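To make the notion of data complexity concrete, here is a minimal sketch of how typed entities and time-stamped, attributed, optionally located relations between them might be represented. It assumes nothing about the actual datasets: the class names (Entity, Relation, RelationalStore), the fields and the sample payments are hypothetical. The same structure would fit customers and shops, or stations and trips in a transport network.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entity:
    entity_id: str
    kind: str  # hypothetical types, e.g. "customer", "shop", "account"


@dataclass
class Relation:
    source: Entity
    target: Entity
    kind: str                       # e.g. "payment"
    timestamp: datetime             # relations evolve over time
    attributes: dict = field(default_factory=dict)   # multi-dimensional attributes
    location: Optional[Tuple[float, float]] = None   # optional (lat, lon)


class RelationalStore:
    """Indexes relations by both endpoints, so many-to-many links are allowed."""

    def __init__(self):
        self._by_entity = defaultdict(list)

    def add(self, rel):
        self._by_entity[rel.source].append(rel)
        self._by_entity[rel.target].append(rel)

    def history(self, entity, kind=None):
        """Relations touching `entity`, optionally filtered by kind, oldest first."""
        rels = [r for r in self._by_entity.get(entity, [])
                if kind is None or r.kind == kind]
        return sorted(rels, key=lambda r: r.timestamp)


# Hypothetical sample data: two payments from one customer to one shop.
alice = Entity("c1", "customer")
cafe = Entity("s1", "shop")
store = RelationalStore()
store.add(Relation(alice, cafe, "payment", datetime(2017, 3, 2, 9, 15), {"amount": 3.5}))
store.add(Relation(alice, cafe, "payment", datetime(2017, 3, 1, 8, 50), {"amount": 2.8}))

amounts = [r.attributes["amount"] for r in store.history(alice, "payment")]
print(amounts)  # [2.8, 3.5]
```

A time-ordered history of this kind is the sort of per-individual sequence a forecasting model would consume, while the shared index over both endpoints is what exposes the “network” of related users and shops.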
[1] Krizhevsky et al., ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012
[2] Sutskever et al., Sequence to Sequence Learning with Neural Networks, NIPS 2014
[3] Hinton et al., Deep Neural Networks for Acoustic Modeling in Speech Recognition, IEEE Signal Processing Magazine, 2012
[4] Silver et al., Mastering the Game of Go with Deep Neural Networks and Tree Search, Nature, 2016
[5] Gal and Ghahramani, Dropout as a Bayesian Approximation: Insights and Applications, ICML 2015
[6] van den Oord et al., Conditional Image Generation with PixelCNN Decoders, NIPS 2016

Pla de Doctorats Industrials


Copyright 2026 © Doctorats Industrials de la Generalitat
